R - Efficient Way to Incrementally Count Unique Data Points in Data Frame
I am trying to find a more efficient way to incrementally count unique data points in a data frame.
For example, I have the following code written:
    df = matrix(c(1,2,3,3,4,5,1,2,4,4))
    count = matrix(nrow = nrow(df), ncol = 1)
    # for each row i, count how many times df[i,1] has appeared in rows 1..i
    for (i in 1:nrow(df)) {
      count[i,1] = length(which(df[1:i,1] == df[i,1]))
    }

The purpose of the code is to incrementally count each instance of a specific value, e.g. the count column will have the following result: 1,1,1,2,1,1,2,2,2,3.

The code I have written so far does the job, but the sample df above contains only 10 values. The real data frame I am trying to perform this function on contains 52,118 values, and it takes an enormous amount of time.
Does anyone know of a more efficient way to execute the code above?
data.table solution
    library(data.table)
    set.seed(20)
    dat <- data.frame(values = sample(1:3, 50000, replace = TRUE))
    # number the rows 1..N within each group of equal values
    setDT(dat)[, runningcount := 1:.N, values]
    dat

Output:

           values runningcount
        1:      3            1
        2:      3            2
        3:      1            1
        4:      2            1
        5:      3            3
       ---
    49996:      1        16674
    49997:      2        16516
    49998:      2        16517
    49999:      2        16518
    50000:      2        16519
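If you would rather stay in base R, the same grouped running count can probably be had with ave(). This is only a sketch on the sample values from the question. The names x and running_count are mine, not from the original code.

```r
# Sample values from the question (the single column of df)
x <- c(1, 2, 3, 3, 4, 5, 1, 2, 4, 4)

# ave() groups x by its own values and applies seq_along() within each group,
# so every element gets its occurrence number, kept in the original order.
running_count <- ave(x, x, FUN = seq_along)

print(running_count)
```

This avoids rescanning rows 1..i for every i, which is what makes the loop in the question slow on 52,118 rows.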