string - How to calculate and substitute values in a specific column of tabular data?


Given the following input:

    mccc processed: unknown event at: tue, 14 oct 2014 12:02:26 cst
    station, mccc delay,    std,    cc coeff,  cc std,   pol   , t0_times  , delay_times
    zj.uno1     -0.7964    0.0051    0.9690    0.0139    0  graw.bhz   301.1263    -1.8041
    zj.dose     -0.7065    0.0072    0.9760    0.0133    0  knyn.bhz   301.3372    -1.9249
    zj.tres      0.9675    0.0072    0.9548    0.0292    0  leon.bhz   301.2611    -0.1749
    phase: p         pde    2013  7 15 14  6 58.00   -60.867   -25.143   31.0  0.0  7.3

I want to remove the mean of the 9th column (delay_times) from each of the delay_times values. This requires summing the 9th-column values, dividing by the number of values, and subtracting the mean from each of the values (-1.8041, -1.9249, -0.1749).

I am confused about how to begin this endeavor. I've provided a starting script below:

    #!/usr/bin/perl
    use strict;
    use warnings;

    open my $file, '<', "file.txt" or die $!;

    while (<$file>) {
        # station name is field 0; delay_times (the 9th column) is field 8
        my ($name, $time) = (split /\s+/, $_)[0,8];

        # calculate the mean of the 9th column for every row that begins with zj,
        # and subtract the mean from each value (time) in the 9th column.
    }

    # output a new file with the mean removed from each "time" in the 9th column

Would this be easier in awk, or in Perl? Thank you.

Your attempted Perl solution was pretty spot on - you just didn't finish it :-)

An "extended one-liner" follows:

    perl -anE 'push @f,[@F] }{ for (@f){ $s += $_->[8] , $n++ if $_->[0] =~ /zj/ }
               say $_->[0] =~ /zj/ ? ( "@{$_}[0..7] ", $_->[8]-($s/$n) ) : "@$_"
               for @f'  data.txt

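It is slightly shortened by using -an (see perlrun) but is not too golfish. Like the awk solution from @john1024, we pass over the data twice - in this case with two for loops over the lines held in memory. We use the ternary operator (<cond> ? <if true> : <if false>) to print out - or say - each of the lines, either as is (@$_), or with the field substitution.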

Output:

    mccc processed: unknown event at: tue, 14 oct 2014 12:02:26 cst
    station, mccc delay, std, cc coeff, cc std, pol , t0_times , delay_times
    zj.uno1 -0.7964 0.0051 0.9690 0.0139 0 graw.bhz 301.1263 -0.5028
    zj.dose -0.7065 0.0072 0.9760 0.0133 0 knyn.bhz 301.3372 -0.6236
    zj.tres 0.9675 0.0072 0.9548 0.0292 0 leon.bhz 301.2611 1.1264
    phase: p pde 2013 7 15 14 6 58.00 -60.867 -25.143 31.0 0.0 7.3
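As a sanity check on the last column of that output, the arithmetic can be reproduced on its own. This is only a minimal sketch, assuming the three delay_times values are copied by hand from the input; the variable names and the rounding to four decimal places are illustrative choices, not part of the answer itself.

```perl
use strict;
use warnings;

# The three delay_times values from the zj rows of the input.
my @delays = (-1.8041, -1.9249, -0.1749);

# Mean = sum of the values divided by how many there are.
my $sum = 0;
$sum += $_ for @delays;
my $mean = $sum / @delays;

# Subtract the mean from each value; round to 4 decimals for display
# (the rounding is an assumption, chosen to match the input precision).
my @demeaned = map { sprintf "%.4f", $_ - $mean } @delays;

printf "mean = %.4f\n", $mean;   # mean = -1.3013
print "@demeaned\n";             # -0.5028 -0.6236 1.1264
```

The sum is -3.9039, so the mean is -1.3013, and subtracting it gives the three values shown in the last column of the output.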

As a script it might read as follows:

    use v5.16;

    my (@timedata, $rec, $sum, $n);

    # read every line and store its whitespace-separated fields
    while (<DATA>) {
        push @timedata, [ split(" ") ];
    }

    # first loop: sum the 9th field and count the rows that start with zj
    foreach $rec (@timedata) {
        $sum += $rec->[8], $n++ if $rec->[0] =~ /zj/;
    }

    # second loop: print each line, substituting the mean-removed value on zj rows
    foreach $rec (@timedata) {
        say $rec->[0] =~ /zj/ ? ( "@{$rec}[0..7] ", $rec->[8]-($sum/$n) )
                              : "@$rec";
    }

    __DATA__
    mccc processed: unknown event at: tue, 14 oct 2014 12:02:26 cst
    station, mccc delay,  std,   cc coeff,  cc std,   pol   , t0_times  , delay_times
    zj.uno1  -0.7964    0.0051    0.9690    0.0139    0  graw.bhz   301.1263 -1.8041
    zj.dose  -0.7065    0.0072    0.9760    0.0133    0  knyn.bhz   301.3372 -1.9249
    zj.tres   0.9675    0.0072    0.9548    0.0292    0  leon.bhz   301.2611 -0.1749
    phase: p         pde    2013  7 15 14  6 58.00   -60.867   -25.143   31.0  0.0  7.3

There may be a way to avoid the two loops (combining while or map with for wouldn't count, though), but creating the sum and average in one pass and then substituting the result in a second makes the script clear and simple.
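If the same operation is needed for other columns or row prefixes, the two passes could be wrapped in a subroutine. The following is a rough sketch under stated assumptions rather than a definitive implementation: demean_column is a hypothetical helper name, the column index and row pattern are passed in by the caller, and rounding to four decimals with sprintf is an assumed formatting choice to match the input precision.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): subtract the mean of
# column $col from that column, on rows whose first field matches $pattern.
sub demean_column {
    my ($lines, $col, $pattern) = @_;

    # First pass: split each line into fields, accumulate sum and count.
    my @rows = map { [ split ' ' ] } @$lines;
    my ($sum, $n) = (0, 0);
    for my $row (@rows) {
        next unless @$row > $col && $row->[0] =~ $pattern;
        $sum += $row->[$col];
        $n++;
    }
    my $mean = $n ? $sum / $n : 0;

    # Second pass: rewrite the chosen column on matching rows only;
    # all other lines are rejoined with single spaces, unchanged otherwise.
    return map {
        my $row = $_;
        if (@$row > $col && $row->[0] =~ $pattern) {
            $row->[$col] = sprintf "%.4f", $row->[$col] - $mean;
        }
        "@$row";
    } @rows;
}

my @input = (
    'station, mccc delay, std, cc coeff, cc std, pol , t0_times , delay_times',
    'zj.uno1 -0.7964 0.0051 0.9690 0.0139 0 graw.bhz 301.1263 -1.8041',
    'zj.dose -0.7065 0.0072 0.9760 0.0133 0 knyn.bhz 301.3372 -1.9249',
    'zj.tres  0.9675 0.0072 0.9548 0.0292 0 leon.bhz 301.2611 -0.1749',
);

# Column index 8 is the 9th field (delay_times).
my @output = demean_column(\@input, 8, qr/^zj/);
print "$_\n" for @output;
```

To produce the new file the question asks for, the lines in @output could be printed to a filehandle opened for writing instead of to standard output.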

