regex - Regular expression to remove commas after the first
I have a file that looks like:
16262|john, doe|john|doe|jd|etc...
I need to find and replace cases such as:
16262|john, doe, dae|john|doe dae|jd|etc...
with:
16262|john, doe dae|john|doe dae|jd|etc...
In summary, I want to remove the commas in the second field that come after the first one (there may be more than one).
Any suggestions?
With GNU sed:
BRE syntax:
sed 's/\(\(^\||\)[^|,]*,\) \?\|, \?/\1 /g;'
ERE syntax:
sed -r 's/((^|\|)[^|,]*,) ?|, ?/\1 /g;'
Details:
(              # group 1: beginning of the item up to the first comma
    (          # group 2:
        ^      # start of line
      |        # or
        \|     # delimiter
    )
    [^|,]*     # start of the item until | or ,
    ,          # first comma
)              # close capture group 1
[ ]?           # optional space
|              # or
,              # other comma
[ ]?           # optional space
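When the first branch succeeds, the first comma is captured in group 1 together with the beginning of the item. Since the replacement string contains a reference to capture group 1 (\1), the first comma stays unchanged; only the optional space after it is swapped for the single space in the replacement.
When the second branch succeeds, group 1 is not defined, so the reference \1 in the replacement string expands to an empty string. That is why the other commas are removed and only a space is left in their place.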
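To see both branches at work, here is a small sketch that runs the ERE command on the two sample lines from the question. It assumes GNU sed is installed as `sed`; the variable names `input` and `output` are only illustrative and do not come from the original answer.

```shell
# Sample lines from the question; the second one has an extra comma in field 2.
input='16262|john, doe|john|doe|jd|etc...
16262|john, doe, dae|john|doe dae|jd|etc...'

# ERE form of the command; -r is the GNU sed flag for extended regexes.
# Branch 1 keeps "|john," through \1, branch 2 turns ", " into a single space.
output=$(printf '%s\n' "$input" | sed -r 's/((^|\|)[^|,]*,) ?|, ?/\1 /g')

printf '%s\n' "$output"
# 16262|john, doe|john|doe|jd|etc...
# 16262|john, doe dae|john|doe dae|jd|etc...
```

Note that the expression is not restricted to the second field: it keeps the first comma of every |-delimited item and removes the later ones, which gives the wanted result here because the other fields contain no commas.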