Using R to process CSV to evaluate if (ColA != ColB) with consideration for ColC
I'm trying to achieve a simple string comparison across two columns. A sample of the (mocked-up) data:
    emplid,from_deptcode,fromdept,to_deptcode,to_dept,transactiontypecode,transactiontype,effectivedate,changetype
    0239583290,21,sales,43,customerservice,10,promotion,12/12/2012
    1230495829,21,sales,21,sales,10,promotion,9/1/2013
    4059503918,93,operations,93,operations,10,demotion,11/18/2014
    3040593021,19,headquarters,23,international,11,reorg,12/13/2011
    7029406920,15,marketing,84,development,19,reassignment,01/05/2010
    2039052819,19,headquarters,19,headquarters,10,promotion,4/15/2015

The logic I want to use is:
    if from_deptcode = to_deptcode
        changetype = "no change"
    elseif from_deptcode != to_deptcode and transactiontype = "reorg"
        changetype = "reorg"
    else
        changetype = "transfer"

so that the output looks like:
    emplid,from_deptcode,fromdept,to_deptcode,to_dept,transactiontypecode,transactiontype,effectivedate,changetype
    0239583290,21,sales,43,customerservice,10,promotion,12/12/2012,transfer
    1230495829,21,sales,21,sales,10,promotion,9/1/2013,no change
    4059503918,93,operations,93,operations,10,demotion,11/18/2014,no change
    3040593021,19,headquarters,23,international,11,reorg,12/13/2011,reorg
    7029406920,15,marketing,84,development,19,reassignment,01/05/2010,transfer
    2039052819,19,headquarters,19,headquarters,10,promotion,4/15/2015,no change

Here's what I have so far:
    transfers <- read.csv(file="transfers.csv", header=TRUE, sep=",",
                          colClasses=c(NA,NA,NA,NA,NA,NA,NA,"Date",NA))

At this point I would, I assume, implement the logic:
    if from_deptcode = to_deptcode
        changetype = "no change"
    elseif from_deptcode != to_deptcode and transactiontype = "reorg"
        changetype = "reorg"
    else
        changetype = "transfer"

I assume that here I'd then write out the new CSV:

    write.csv(transfers, file = "transfersprocessed.csv", row.names = FALSE)
Any advice on getting the rest of the way there?
Update:
Per the answer from @josilber, I ran the following code:
    transfers <- read.csv(file="transfers.csv", header=TRUE, sep=",",
                          colClasses=c(NA,NA,NA,NA,NA,NA,NA,"Date",NA))
    dat$changetype <- ifelse(dat$from_deptcode == dat$to_deptcode, "no change",
                             ifelse(dat$transactiontype == "reorg", "reorg", "transfer"))
    View(transfers)

on the following data:
    emplid,from_deptcode,fromdept,to_deptcode,to_dept,transactiontypecode,transactiontype,effectivedate,changetype
    0239583290,21,sales,43,customerservice,10,promotion,12/12/2012
    1230495829,21,sales,21,sales,10,promotion,9/1/2013
    4059503918,93,operations,93,operations,10,demotion,11/18/2014
    3040593021,19,headquarters,23,international,11,reorg,12/13/2011
    7029406920,15,marketing,84,development,19,reassignment,01/05/2010
    2039052819,19,headquarters,19,headquarters,10,promotion,4/15/2015

and the changetype variable remained NA.
Is the nested ifelse statement syntax correct? Any idea why changetype isn't working?
You can do this with a nested ifelse statement:
    dat$changetype <- ifelse(dat$from_deptcode == dat$to_deptcode, "no change",
                             ifelse(dat$transactiontype == "reorg", "reorg", "transfer"))
    dat
    #       emplid from_deptcode     fromdept to_deptcode         to_dept transactiontypecode
    # 1  239583290            21        sales          43 customerservice                  10
    # 2 1230495829            21        sales          21           sales                  10
    # 3 4059503918            93   operations          93      operations                  10
    # 4 3040593021            19 headquarters          23   international                  11
    # 5 7029406920            15    marketing          84     development                  19
    # 6 2039052819            19 headquarters          19    headquarters                  10
    #   transactiontype effectivedate changetype
    # 1       promotion    12/12/2012   transfer
    # 2       promotion      9/1/2013  no change
    # 3        demotion    11/18/2014  no change
    # 4           reorg    12/13/2011      reorg
    # 5    reassignment    01/05/2010   transfer
    # 6       promotion     4/15/2015  no change

ifelse is passed a vector of TRUE/FALSE values as its first argument; it uses the second argument for the TRUE cases and the third argument for the FALSE cases. For the FALSE cases we want to run another ifelse, which is why the logic is nested here.
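To make the nesting easier to follow, here is a minimal sketch that splits the two ifelse calls into separate steps. The vectors from, to, and type are hypothetical toy data invented for illustration; they are not the asker's data frame.

```r
# Hypothetical toy vectors standing in for the three columns involved
from <- c(21, 21, 19)
to   <- c(43, 21, 23)
type <- c("promotion", "promotion", "reorg")

# Outer test: TRUE where the department code did not change
same <- from == to

# Inner ifelse: the value to use wherever the outer test is FALSE
inner <- ifelse(type == "reorg", "reorg", "transfer")

# Outer ifelse: "no change" for TRUE, otherwise the inner result
changetype <- ifelse(same, "no change", inner)
print(changetype)
# [1] "transfer"  "no change" "reorg"
```

Writing the inner call inline, as in the answer, gives the same result; the intermediate variables are only there for readability.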
Note that for large data frames this will be a good deal quicker than looping through the data and doing a nested if statement one row at a time.
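The two approaches can be compared on the sample rows. The following is a minimal sketch rather than the answerer's code, and it makes a few assumptions: the question's sample data is inlined via read.csv(text = ...) instead of being read from transfers.csv, the empty changetype column is left out of the header so that each row has as many fields as the header, and emplid is read as character to keep its leading zeros. It also uses a single data frame name, transfers, for both reading and assignment; the code in the update reads into transfers but assigns to dat, which may be why the column stayed NA there.

```r
# Sample rows from the question, inlined so the sketch is self-contained
csv_text <- "emplid,from_deptcode,fromdept,to_deptcode,to_dept,transactiontypecode,transactiontype,effectivedate
0239583290,21,sales,43,customerservice,10,promotion,12/12/2012
1230495829,21,sales,21,sales,10,promotion,9/1/2013
4059503918,93,operations,93,operations,10,demotion,11/18/2014
3040593021,19,headquarters,23,international,11,reorg,12/13/2011
7029406920,15,marketing,84,development,19,reassignment,01/05/2010
2039052819,19,headquarters,19,headquarters,10,promotion,4/15/2015
"

# Assumption: emplid is an identifier, so keep it as text
transfers <- read.csv(text = csv_text, header = TRUE,
                      colClasses = c(emplid = "character"),
                      stringsAsFactors = FALSE)

# Vectorized version: the nested ifelse from the answer,
# assigned to the same data frame that was read in
transfers$changetype <- ifelse(
  transfers$from_deptcode == transfers$to_deptcode, "no change",
  ifelse(transfers$transactiontype == "reorg", "reorg", "transfer"))

# Row-by-row version: the same logic with if / else if / else
looped <- character(nrow(transfers))
for (i in seq_len(nrow(transfers))) {
  if (transfers$from_deptcode[i] == transfers$to_deptcode[i]) {
    looped[i] <- "no change"
  } else if (transfers$transactiontype[i] == "reorg") {
    looped[i] <- "reorg"
  } else {
    looped[i] <- "transfer"
  }
}

# Both approaches should label every row identically
stopifnot(identical(looped, transfers$changetype))
print(transfers[, c("emplid", "transactiontype", "changetype")])
```

From here, the write.csv call from the question can be applied to transfers unchanged. The sketch does not measure timing; it only shows that the loop and the vectorized form produce the same labels.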