r - An efficient way to indicate multiple indicator variables per row with composite key?
My indicator and value objects have composite keys that map to each other. Is there an efficient way to aggregate the values into the indicator object?
Given an "empty" indicator data frame:
    indicator <- data.frame(id1=c(1,1,2,2,3,3,4,4),
                            id2=c(10,11,10,12,10,12,10,12),
                            ind_a=rep(0,8), ind_b=rep(0,8))

      id1 id2 ind_a ind_b
    1   1  10     0     0
    2   1  11     0     0
    3   2  10     0     0
    4   2  12     0     0
    5   3  10     0     0
    6   3  12     0     0
    7   4  10     0     0
    8   4  12     0     0

and a data frame of values:
    values <- data.frame(id1=c(1,1,1,2,2,3,3,4,4,4),
                         id2=c(10,10,11,10,12,10,12,10,10,12),
                         indicators=c('ind_a','ind_b','ind_a','ind_b','ind_a',
                                      'ind_a','ind_a','ind_a','ind_b','ind_a'))

       id1 id2 indicators
    1    1  10      ind_a
    2    1  10      ind_b
    3    1  11      ind_a
    4    2  10      ind_b
    5    2  12      ind_a
    6    3  10      ind_a
    7    3  12      ind_a
    8    4  10      ind_a
    9    4  10      ind_b
    10   4  12      ind_a

I want to end up with:
    id1 id2 ind_a ind_b
      1  10     1     1
      1  11     1     0
      2  10     0     1
      2  12     1     0
      3  10     1     0
      3  12     1     0
      4  10     1     1
      4  12     1     0
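You can use dcast to convert the "values" dataset from 'long' to 'wide' format. With length as the aggregation function, each cell counts how many rows of "values" carry that id1/id2/indicator combination, which is 0 or 1 here because no combination is repeated.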
    library(reshape2)
    dcast(values, id1+id2~indicators, value.var='indicators', length)
    #  id1 id2 ind_a ind_b
    #1   1  10     1     1
    #2   1  11     1     0
    #3   2  10     0     1
    #4   2  12     1     0
    #5   3  10     1     0
    #6   3  12     1     0
    #7   4  10     1     1
    #8   4  12     1     0

As shown above, you may not need to create a second dataset. If you do need to change the values in one dataset based on the values in the other:
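    # Build an "id1 id2 indicator" key per row and test whether it occurs in values;
    # adding 0L turns the logical result into integer 0/1.
    indicator$ind_a <- (do.call(paste, c(indicator[1:2], 'ind_a')) %in% do.call(paste, values))+0L
    indicator$ind_b <- (do.call(paste, c(indicator[1:2], 'ind_b')) %in% do.call(paste, values))+0L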
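The same key-matching idea can be applied to any number of indicator columns with a loop instead of one assignment per column. The following is only a minimal base R sketch, assuming the key columns are named id1 and id2 as in the question; value_keys and ind_cols are illustrative names that do not appear in the original answer.

```r
# Rebuild the two data frames from the question.
indicator <- data.frame(id1 = c(1, 1, 2, 2, 3, 3, 4, 4),
                        id2 = c(10, 11, 10, 12, 10, 12, 10, 12),
                        ind_a = rep(0, 8), ind_b = rep(0, 8))
values <- data.frame(id1 = c(1, 1, 1, 2, 2, 3, 3, 4, 4, 4),
                     id2 = c(10, 10, 11, 10, 12, 10, 12, 10, 10, 12),
                     indicators = c('ind_a', 'ind_b', 'ind_a', 'ind_b', 'ind_a',
                                    'ind_a', 'ind_a', 'ind_a', 'ind_b', 'ind_a'))

# Composite keys that occur in values, as "id1 id2 indicator" strings.
value_keys <- paste(values$id1, values$id2, values$indicators)

# Assumption: every column other than the two key columns is an indicator.
ind_cols <- setdiff(names(indicator), c('id1', 'id2'))

for (col in ind_cols) {
  # Key each indicator row would have if this indicator were set for it.
  row_keys <- paste(indicator$id1, indicator$id2, col)
  indicator[[col]] <- as.integer(row_keys %in% value_keys)
}

indicator
```

Because the keys are compared as pasted strings, this relies on id1 and id2 being formatted identically in both data frames, which holds for the whole numbers used here.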