python - Conditional transform on pandas
I've got a simple problem that I can't seem to get right. Consider a dataframe:
df = pd.DataFrame({'group': ['a', 'a', 'a', 'b', 'b'], 'time': [20, 21, 22, 20, 21], 'price': [3.1, 3.5, 3.0, 2.3, 2.1]})

  group  price  time
0     a    3.1    20
1     a    3.5    21
2     a    3.0    22
3     b    2.3    20
4     b    2.1    21
Now I want to take the standard deviation of the price of each group, conditional on being before time 22 (let's call it early_std). I want to create a variable with that information.
The expected result is:

group  price  time  early_std
    a    3.1    20   0.282843
    a    3.5    21   0.282843
    a    3.0    22   0.282843
    b    2.3    20   0.141421
    b    2.1    21   0.141421
This is what I tried:
df['early_std'] = df[df.time < 22].groupby('group').price.transform(lambda x: x.std())
This works, but it gives a missing value at time = 22:

  group  price  time  early_std
0     a    3.1    20   0.282843
1     a    3.5    21   0.282843
2     a    3.0    22        NaN
3     b    2.3    20   0.141421
4     b    2.1    21   0.141421
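The missing value comes from index alignment: the filtered frame only contains rows 0, 1, 3 and 4, so the transformed result has no entry for row 2, and assigning it back to df leaves that row as NaN. A minimal sketch of this with the same data (the name `early` is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'group': ['a', 'a', 'a', 'b', 'b'],
                   'time': [20, 21, 22, 20, 21],
                   'price': [3.1, 3.5, 3.0, 2.3, 2.1]})

# transform returns one value per row of the *filtered* frame only
early = df[df.time < 22].groupby('group')['price'].transform('std')
print(early.index.tolist())   # row label 2 is absent

# column assignment aligns on the index, so row 2 has nothing to align with
df['early_std'] = early
print(df['early_std'].isna().tolist())
```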
I also tried with apply, and I think it works, but I would need to reset the index, which I'd rather avoid (I have a large dataset and need to do this repeatedly):
early_std2 = df[df.time < 22].groupby('group').price.std()
df.set_index('group', inplace=True)
df['early_std2'] = early_std2

       price  time  early_std  early_std2
group
a        3.1    20   0.282843    0.282843
a        3.5    21   0.282843    0.282843
a        3.0    22        NaN    0.282843
b        2.3    20   0.141421    0.141421
b        2.1    21   0.141421    0.141421
Thanks!
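For comparison, the index change in the second attempt is not strictly required. A minimal sketch, assuming `Series.map` is used to look up each row's group in the per-group result, so that df keeps its original index:

```python
import pandas as pd

df = pd.DataFrame({'group': ['a', 'a', 'a', 'b', 'b'],
                   'time': [20, 21, 22, 20, 21],
                   'price': [3.1, 3.5, 3.0, 2.3, 2.1]})

# one standard deviation per group, computed on the early rows only
early_std = df[df.time < 22].groupby('group')['price'].std()

# map each row's group label to that group's value; no set_index needed
df['early_std'] = df['group'].map(early_std)
print(df)
```

Because the lookup goes through the group label rather than the row index, the row at time 22 receives its group's value as well.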
It looks like you need to add fillna() to your first code to expand the std values:
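df['early_std'] = df[df.time < 22].groupby('group')['price'].transform(pd.Series.std)
# transform keeps the original row index (apply may add the group keys to the index in recent pandas versions)
df['early_std'] = df.groupby('group')['early_std'].transform(lambda x: x.fillna(x.max()))
df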
to get:
  group  price  time  early_std
0     a    3.1    20      0.283
1     a    3.5    21      0.283
2     a    3.0    22      0.283
3     b    2.3    20      0.141
4     b    2.1    21      0.141
Edit: I have changed ffill to the more general fillna, but you can also use chained .bfill().ffill() to achieve the same result.
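The chained variant could look like the following. This is only a sketch on the question's data, assuming the fill is done per group through `transform`:

```python
import pandas as pd

df = pd.DataFrame({'group': ['a', 'a', 'a', 'b', 'b'],
                   'time': [20, 21, 22, 20, 21],
                   'price': [3.1, 3.5, 3.0, 2.3, 2.1]})

# early-only standard deviation; rows with time >= 22 are NaN at this point
df['early_std'] = df[df.time < 22].groupby('group')['price'].transform('std')

# fill the gaps inside each group from neighbouring rows of the same group
df['early_std'] = (df.groupby('group')['early_std']
                     .transform(lambda x: x.bfill().ffill()))
print(df)
```

Doing the fill per group matters here: an ungrouped .bfill() would fill row 2 from row 3, which belongs to group b.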