r - How to import and arrange a data set separated by comma?

July 15, 2011

This is a specific question. I have data that looks like the sample below, only much bigger and spread over many files (not just one file).

    tourerg_id,rawdataid,indexno,indexvalue
    19003771,11,240,1.1858652499
    19003771,11,241,1.177533477
    19003771,11,242,1.1704270598
    19003771,11,243,1.1620838731
    19003771,11,244,1.1540253051
    19003771,11,245,1.1464526996
    19003771,11,246,1.1394576168
    19003771,11,247,1.1328267903
    19003771,11,248,1.1258228114
    19003771,11,249,1.1171001937

    19003771,11,249,1.1237839518
    19003771,11,250,1.1113389261
    19003771,11,251,1.0938118176
    19003771,11,252,1.0704340703
    19003771,11,253,1.0418955374
    19003771,11,254,1.0104241602
    19003771,11,255,0.97917606379
    19003771,11,256,0.95110409662
    19003771,11,257,0.9277733067
    19003771,11,258,0.90865127357

    19000693,11,240,1.1952986902
    19000693,11,241,1.1867360653
    19000693,11,242,1.1793816406
    19000693,11,243,1.1707059267
    19000693,11,244,1.1623008189
    19000693,11,245,1.1543825533
    19000693,11,246,1.1470470507
    19000693,11,247,1.1400880358
    19000693,11,248,1.1327804778
    19000693,11,249,1.1237839518

    19000693,11,252,1.0704340703
    19000693,11,253,1.0418955374
    19000693,11,254,1.0104241602
    19000693,11,255,0.97917606379
    19000693,11,256,0.95110409662
    19000693,11,257,0.9277733067
    19000693,11,258,0.90865127357
    19000693,11,259,0.89118257832
    19000693,11,260,0.87161311454
    19000693,11,261,0.84625725399

What I want to have is shown below. That is, for each block, keep the first value before the comma, prefix it with id, append _1 for the first block and _2 for the second, and keep only the values after the last comma.

    id_19003771_1   id_19003771_2   id_19000693_1   id_19000693_2
    1.1858652499    1.1237839518    1.1952986902    1.0704340703
    1.177533477     1.1113389261    1.1867360653    1.0418955374
    1.1704270598    1.0938118176    1.1793816406    1.0104241602
    1.1620838731    1.0704340703    1.1707059267    0.97917606379
    1.1540253051    1.0418955374    1.1623008189    0.95110409662
    1.1464526996    1.0104241602    1.1543825533    0.9277733067
    1.1394576168    0.97917606379   1.1470470507    0.90865127357
    1.1328267903    0.95110409662   1.1400880358    0.89118257832
    1.1258228114    0.9277733067    1.1327804778    0.87161311454
    1.1171001937    0.90865127357   1.1237839518    0.84625725399

To be honest, I do not know where to start.

We can use read.table with blank.lines.skip=FALSE to read the blank lines as NA rows. Use the NA rows to create a grouping variable ('gr') and split the last column by 'gr'. We can then name the list elements after 'tourerg_id'. If the same 'tourerg_id' occurs more than once, use make.unique to create a unique 'id'. Based on the comments, if you need separate data.frames in the global environment, use list2env (though this is not recommended), as most of the operations can be done within the list itself.

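    # Read the file, keeping blank lines as rows of NA
    df1 <- read.table('nemo3.txt', sep=",", stringsAsFactors=FALSE,
              header=TRUE, blank.lines.skip=FALSE)
    indx <- is.na(df1[,1])    # TRUE for rows that were blank lines
    gr <- cumsum(indx)        # grouping variable: one value per block
    # Split the last column by block, dropping the NA rows
    lst <- split(df1[4][!indx, , drop=FALSE], gr[!indx])
    # One 'tourerg_id' per block
    nm1 <- tapply(df1[,1], gr,
                 FUN=function(x) unique(x[!is.na(x)]))
    names(lst) <- paste('id', make.unique(as.character(nm1)), sep="_")
    # Optional: create one data.frame per block in the global environment
    list2env(lst, envir=.GlobalEnv)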
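To see the grouping step in isolation, here is a small self-contained sketch that reads a shortened inline sample through `text=` instead of a file, then binds the blocks side by side into the wide layout asked for in the question. The shortened sample, the names `txt`, `blank`, `vals`, `ids`, `occ` and `wide`, and the `_1`/`_2` numbering via `ave` are my own assumptions for illustration and not part of the answer above; the sketch also assumes every block has the same number of rows.

```r
# Shortened inline sample (assumption): two ids, two blank-line-separated blocks each
txt <- "tourerg_id,rawdataid,indexno,indexvalue
19003771,11,240,1.1858652499
19003771,11,241,1.177533477

19003771,11,249,1.1237839518
19003771,11,250,1.1113389261

19000693,11,240,1.1952986902
19000693,11,241,1.1867360653

19000693,11,252,1.0704340703
19000693,11,253,1.0418955374"

# Blank lines are kept and come back as rows of NA
df1 <- read.table(text = txt, sep = ",", header = TRUE,
                  stringsAsFactors = FALSE, blank.lines.skip = FALSE)

blank <- is.na(df1$tourerg_id)   # TRUE on rows that came from blank lines
gr <- cumsum(blank)              # block counter, increases at each blank line

# Values and id of each block, in file order
vals <- split(df1$indexvalue[!blank], gr[!blank])
ids <- as.vector(tapply(df1$tourerg_id[!blank], gr[!blank], unique))

# Number the blocks within each id: 1 for the first block, 2 for the second
occ <- ave(seq_along(ids), ids, FUN = seq_along)
names(vals) <- paste("id", ids, occ, sep = "_")

# One column per block; requires blocks of equal length
wide <- as.data.frame(vals)
print(wide)
```

For many files, the same steps could presumably be wrapped in a function and applied over the result of `list.files()` with `lapply`.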

If you need a single dataset with a grouping column,

    library(tidyr)
    res <- unnest(lst, group)
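Calling `unnest()` on a plain list reflects the tidyr interface of the time, and more recent tidyr versions may expect a data frame instead. As a hedged alternative, the same long layout can be sketched in base R. The small `lst` below is a hypothetical stand-in for the list built earlier, and `pieces` is a name introduced only for this example.

```r
# Hypothetical stand-in for the list of one-column data.frames built earlier
lst <- list(
  id_19003771   = data.frame(indexvalue = c(1.1858652499, 1.177533477)),
  id_19003771.1 = data.frame(indexvalue = c(1.1237839518, 1.1113389261))
)

# Add each element's name as a 'group' column
pieces <- Map(function(d, g) data.frame(group = g, d, stringsAsFactors = FALSE),
              lst, names(lst))

# Stack the pieces into a single data.frame and reset the row names
res <- do.call(rbind, unname(pieces))
rownames(res) <- NULL
print(res)
```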
