python - How do I put a series (such as the result of a pandas groupby.apply(f)) into a new column of the dataframe?
I have a dataframe and want to calculate statistics on it (value_count, mode, mean, etc.), then put the result in a new column. My current solution is O(n**2) or so, and I'm sure there is a faster, obvious method that I'm overlooking.
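    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randint(10, size=(100, 10)), columns=list('abcdefghij'))
    df['result'] = 0

    groups = df.groupby([df.i, df.j])
    for g in groups:
        # g is a (key, sub-frame) pair; the key is the (i, j) tuple of this group
        icol_eq = df.i == g[0][0]
        jcol_eq = df.j == g[0][1]
        i_and_j = icol_eq & jcol_eq
        # .loc avoids the chained assignment df['result'][i_and_j] = ...
        df.loc[i_and_j, 'result'] = len(g[1])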
The above works, but it is extremely slow for large dataframes.

I tried

    df['result'] = df.groupby([df.i, df.j]).apply(len)

but that doesn't seem to work. Nor does

    def f(g):
        g['result'] = len(g)
        return g

    df.groupby([df.i, df.j]).apply(f)

Nor can I merge the resulting series of df.groupby.apply(lambda x: len(x)).
You want to use transform:

    In [98]: df['result'] = df.groupby([df.i, df.j]).transform(len)
        ...: df
    Out[98]:
        a  b  c  d  e  f  g  h  i  j  result
    0   6  1  3  0  1  1  4  2  8  6       6
    1   1  3  9  7  5  5  3  5  4  4       1
    2   1  5  0  1  8  1  4  7  3  9       1
    3   6  8  6  4  6  0  8  0  6  5       6
    4   7  9  7  2  8  9  9  6  0  6       7
    5   3  5  5  7  2  7  7  3  2  8       3
    6   5  0  4  7  5  7  5  7  9  1       5
    7   3  2  5  4  3  6  8  4  2  0       3
    8   2  3  0  4  8  5  7  9  7  2       2
    9   1  1  3  2  3  5  6  6  5  6       1
    10  3  0  2  7  1  8  1  3  5  4       3
    ....

transform returns a Series whose index is aligned with the original df, so you can add it as a column.
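As a rough sketch of the same idea, the snippet below selects a single column before calling transform. Called on the whole grouped frame, transform returns one column per input column, so picking one column first is a safer way to get a Series. The fixed seed, the choice of column a, and the extra column names (a_mean, a_mode, result_joined) are assumptions made for illustration and are not part of the original question. The last lines show one possible alternative: compute one value per group, then join it back on the key columns.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question; the seed is an assumption for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(10, size=(100, 10)), columns=list("abcdefghij"))

keys = ["i", "j"]

# Group size for every row: selecting one column makes transform return a Series
# that shares the index of df, so it can be assigned directly.
df["result"] = df.groupby(keys)["a"].transform("size")

# Other per-group statistics are broadcast back to the rows in the same way.
df["a_mean"] = df.groupby(keys)["a"].transform("mean")
df["a_mode"] = df.groupby(keys)["a"].transform(lambda s: s.mode().iloc[0])

# Alternative: reduce to one value per group, then join it back on the keys.
sizes = df.groupby(keys).size().rename("result_joined")
df = df.join(sizes, on=keys)
```

Both routes avoid the Python-level loop over groups with a boolean mask per group, which is what makes the original approach slow on large dataframes.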