data.table - R: Roll up column values containing NAs by sum while grouping by IDs
I have a data frame, created like this:

id <- c("a","a","a","a","b","b","b","b")
type <- c(45,45,46,46,45,45,46,46)
point_a <- c(10,NA,30,40,NA,80,NA,100)
point_b <- c(NA,32,43,NA,65,11,NA,53)
df <- data.frame(id,type,point_a,point_b)

  id type point_a point_b
1  a   45      10      NA
2  a   45      NA      32
3  a   46      30      43
4  a   46      40      NA
5  b   45      NA      65
6  b   45      80      11
7  b   46      NA      NA
8  b   46     100      53
From an earlier post I learnt how to roll up the data by id for a single column.
I am using sqldf to sum the rows, grouping by id and type. While this does the job for me, it is slow on a bigger dataset.
df1 <- sqldf("select id, type, sum(point_a) as point_a, sum(point_b) as point_b from df group by id, type")
Please suggest other techniques to solve this problem. I have started learning the dplyr and plyr packages and find them interesting, but I don't know how to apply them here.
Desired output:

  id type point_a point_b
1  a   45      10      32
2  a   46      70      43
3  b   45      80      76
4  b   46     100      53
Using dplyr:

df %>%
  group_by(id, type) %>%
  summarise_each(funs(sum(., na.rm = TRUE)))
or
df %>%
  group_by(id, type) %>%
  summarise(point_a = sum(point_a, na.rm = TRUE),
            point_b = sum(point_b, na.rm = TRUE))
or
# helper that sums a vector while ignoring NAs
f <- function(x) sum(x, na.rm = TRUE)

df %>%
  group_by(id, type) %>%
  summarise(point_a = f(point_a),
            point_b = f(point_b))
which gives:
#Source: local data frame [4 x 4]
#Groups: id
#
#  id type point_a point_b
#1  a   45      10      32
#2  a   46      70      43
#3  b   45      80      76
#4  b   46     100      53
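
Since the question is also tagged data.table, here is a minimal sketch of the same roll-up with that package. It assumes data.table is installed; the names dt and res are just illustrative, and the data is rebuilt from the question so the snippet runs on its own. It is one possible approach, not necessarily the fastest for every dataset.

```r
library(data.table)

# Sample data copied from the question
df <- data.frame(
  id      = c("a", "a", "a", "a", "b", "b", "b", "b"),
  type    = c(45, 45, 46, 46, 45, 45, 46, 46),
  point_a = c(10, NA, 30, 40, NA, 80, NA, 100),
  point_b = c(NA, 32, 43, NA, 65, 11, NA, 53)
)

# Convert to a data.table (dt is an illustrative name)
dt <- as.data.table(df)

# .SD holds every column that is not a grouping column;
# sum each of them within every id/type pair, dropping NAs
res <- dt[, lapply(.SD, sum, na.rm = TRUE), by = .(id, type)]

print(res)
```

Note that with na.rm = TRUE a group whose values are all NA sums to 0 rather than NA. The sample data has no such group, but it is worth checking on a real dataset.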