Python Pandas Create New Column with groupby().sum()


I am trying to create a new column from a groupby calculation. In the code below, I get the correct calculated values for each date (see group below), but when I try to create a new column (df['data4']) from it, I get NaN. I want the new column to hold the sum of 'data3' over all rows sharing a date, applied to each row with that date. For example, 2015-05-08 appears in 2 rows (total 50+5 = 55), and in the new column I would like to have 55 in both of those rows.

    import pandas as pd
    import numpy as np
    from pandas import DataFrame

    df = pd.DataFrame({
        'date': ['2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05',
                 '2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05'],
        'sym': ['aapl', 'aapl', 'aapl', 'aapl', 'aaww', 'aaww', 'aaww', 'aaww'],
        'data2': [11, 8, 10, 15, 110, 60, 100, 40],
        'data3': [5, 8, 6, 1, 50, 100, 60, 120]})

    # One total per date, indexed by the date values
    group = df['data3'].groupby(df['date']).sum()

    # This is the assignment that produces NaN in every row
    df['data4'] = group

You want to use transform. It returns a Series whose index is aligned with df, so you can add it directly as a new column:

    In [74]:
    df = pd.DataFrame({
        'date': ['2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05',
                 '2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05'],
        'sym': ['aapl', 'aapl', 'aapl', 'aapl', 'aaww', 'aaww', 'aaww', 'aaww'],
        'data2': [11, 8, 10, 15, 110, 60, 100, 40],
        'data3': [5, 8, 6, 1, 50, 100, 60, 120]})

    df['data4'] = df['data3'].groupby(df['date']).transform('sum')
    df

    Out[74]:
       data2  data3        date   sym  data4
    0     11      5  2015-05-08  aapl     55
    1      8      8  2015-05-07  aapl    108
    2     10      6  2015-05-06  aapl     66
    3     15      1  2015-05-05  aapl    121
    4    110     50  2015-05-08  aaww     55
    5     60    100  2015-05-07  aaww    108
    6    100     60  2015-05-06  aaww     66
    7     40    120  2015-05-05  aaww    121
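The reason the original assignment yields NaN is likely index alignment: sum() returns one value per date, indexed by the date strings, while df is indexed 0 to 7, so no labels match. The sketch below is a minimal illustration of that, using a reduced version of the question's frame; the names broken, via_transform and via_map are only illustrative, and the map lookup is an alternative not mentioned in the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05',
             '2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05'],
    'data3': [5, 8, 6, 1, 50, 100, 60, 120],
})

# sum() collapses each group to one row, indexed by the date strings.
group = df['data3'].groupby(df['date']).sum()

# Column assignment aligns on index labels. df's labels are 0..7 and
# group's labels are dates, so nothing matches and every row is NaN.
broken = df.copy()
broken['data4'] = group

# transform('sum') broadcasts each group's total back onto df's own index.
via_transform = df['data3'].groupby(df['date']).transform('sum')

# Alternative (an assumption, not from the answer): look up each row's
# date in the per-date totals.
via_map = df['date'].map(group)
```

Both via_transform and via_map can be assigned to df['data4'] directly, since each has one value per original row.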
