python - Sorting a DataFrame such that NA values on the first sort column would be at the end regardless of the secondary sort columns
I am using DataFrame.sort, aiming for the default behavior of pushing NA values to the end.
The problem is that when I add secondary sort columns, NA values on the first sort column don't behave like non-NA values. Apparently, if I have an NA in the first column, it is overridden by the secondary columns if they aren't NA.
For example:
In [1]: df = DataFrame([[1, 1], [None, 0]])

In [2]: df.sort([0])
Out[2]:
    0  1
0   1  1
1 NaN  0

In [3]: df.sort([0, 1])
Out[3]:
    0  1
1 NaN  0
0   1  1
The last sort demonstrates the undesirable behavior: the value on the first sort column (0) is NaN, so record 1 should be at the end. It's not, because apparently the second column (1) takes precedence.
Is there a way to sort df such that the secondary sort column is used only to resolve equality among values in the first sort column, while still keeping NAs at the end, regardless of the secondary column's value?
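One way to get this ordering without relying on how a given pandas version treats NaN in multi-column sorts is to sort on an explicit "is null" flag first. The following is only a sketch written against a recent pandas (it uses sort_values, assign, and drop(columns=...), none of which exist in 0.13.1); the helper column name _first_is_na and the extra third row are made up for illustration.

```python
import pandas as pd

# Same shape as the example above, plus one extra row to exercise the tie-break.
df = pd.DataFrame([[1, 1], [None, 0], [1, 0]])

# Hypothetical helper column: True where the first sort column is NA.
# False sorts before True, so NA rows land at the end no matter what
# the secondary column contains.
result = (
    df.assign(_first_is_na=df[0].isnull())
      .sort_values(by=["_first_is_na", 0, 1])
      .drop(columns="_first_is_na")
)

print(result)
```

Column 1 only decides the order among rows whose values in column 0 are equal; the row with NaN in column 0 stays last.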
Apparently this was a bug that has since been fixed. I was using pandas 0.13.1. Upgrading to 0.16.1 produced the desired behavior:
In [4]: df.sort([0, 1])
Out[4]:
    0  1
0   1  1
1 NaN  0
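Note that DataFrame.sort itself was later deprecated (pandas 0.17) and removed (0.20) in favor of sort_values. A minimal sketch of the same sort on a current pandas, assuming the same two-row frame as above:

```python
import pandas as pd

df = pd.DataFrame([[1, 1], [None, 0]])

# na_position="last" is the default; it is spelled out here to make the
# intent explicit. Column 1 is only used to break ties in column 0.
result = df.sort_values(by=[0, 1], na_position="last")

print(result)
```

Passing na_position="first" instead would move the rows with NA in the sort columns to the top.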