r - Efficient Way to Incrementally Count Unique Data Points in Data Frame

September 15, 2012

I am trying to find a more efficient way to incrementally count unique data points in a data frame.

For example, I have the following code written:

    df = matrix(c(1,2,3,3,4,5,1,2,4,4))
    count = matrix(nrow = nrow(df), ncol = 1)
    # for each row, count how many times its value has appeared so far
    for (i in 1:nrow(df)) {
      count[i,1] = length(which(df[1:i,1] == df[i,1]))
    }

The purpose of the code is to incrementally count each instance of a specific value, e.g. the count column would have the following result:

1,1,1,2,1,1,2,2,2,3.

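The code I have written so far does the job, but the sample df above contains only 10 values. The real data frame I am trying to perform this function on contains 52,118 values, and the loop takes an enormous amount of time because every iteration rescans all of the rows before it.

Does anyone know of a more efficient way to execute the code above?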

data.table solution

    library(data.table)
    set.seed(20)
    dat <- data.frame(values = sample(1:3, 50000, replace = TRUE))
    # convert to a data.table by reference, then number the rows within each group of values
    setDT(dat)[, runningcount := 1:.N, by = values]

           values runningcount
        1:      3            1
        2:      3            2
        3:      1            1
        4:      2            1
        5:      3            3
       ---
    49996:      1        16674
    49997:      2        16516
    49998:      2        16517
    49999:      2        16518
    50000:      2        16519
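If data.table is not available, the same running count can probably be obtained in base R with ave(). This is a minimal sketch and not part of the original answer. The names x and running_count are chosen here for illustration, and the data is the 10-value sample from the question.

```r
# Sample values from the question (illustrative variable name)
x <- c(1, 2, 3, 3, 4, 5, 1, 2, 4, 4)

# ave() groups x by its own values and applies seq_along() within each group,
# so each element gets its occurrence number, returned in the original order
running_count <- ave(x, x, FUN = seq_along)

print(running_count)
```

On the sample this gives the same 1,1,1,2,1,1,2,2,2,3 sequence as the loop. It avoids rescanning earlier rows on every iteration, which is where the loop spends its time.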
