Turning multi-line string into single comma-separa

2019-01-12 22:33发布

问题:

Let's say I have the following string:

something1:    +12.0   (some unnecessary trailing data (this must go))
something2:    +15.5   (some more unnecessary trailing data)
something4:    +9.0   (some other unnecessary data)
something1:    +13.5  (blah blah blah)

How do I turn that into simply

+12.0,+15.5,+9.0,+13.5

in bash?

回答1:

You can use awk and sed:

awk -vORS=, '{ print $2 }' file.txt | sed 's/,$/\n/'

Or if you want to use a pipe:

echo "data" | awk -vORS=, '{ print $2 }' | sed 's/,$/\n/'

To break it down:

  • awk is great at handling data broken down into fields
  • -vORS=, sets the "output record separator" to ,, which is what you wanted
  • { print $2 } tells awk to print the second field for every record (line)
  • file.txt is your filename
  • sed just gets rid of the trailing , and turns it into a newline (if you want no newline, you can do s/,$//)


回答2:

Clean and simple:

awk '{print $2}' file.txt | paste -s -d, -


回答3:

$ awk -v ORS=, '{print $2}' data.txt | sed 's/,$//'
+12.0,+15.5,+9.0,+13.5

$ cat data.txt | tr -s ' ' | cut -d ' ' -f 2 | tr '\n' ',' | sed 's/,$//'
+12.0,+15.5,+9.0,+13.5


回答4:

This should work too

awk '{print $2}' file | sed ':a;{N;s/\n/,/};ba'


回答5:

cat data.txt | xargs | sed -e 's/ /, /g'


回答6:

This might work for you:

cut -d' ' -f5 file | paste -d',' -s
+12.0,+15.5,+9.0,+13.5

or

sed '/^.*\(+[^ ]*\).*/{s//\1/;H};${x;s/\n/,/g;s/.//p};d' file
+12.0,+15.5,+9.0,+13.5


回答7:

awk one liner

$ awk '{printf (NR>1?",":"") $2}' file

+12.0,+15.5,+9.0,+13.5


回答8:

try this:

sedSelectNumbers='s".* \(+[0-9]*[.][0-9]*\) .*"\1,"'
sedClearLastComma='s"\(.*\),$"\1"'
cat file.txt |sed "$sedSelectNumbers" |tr -d "\n" |sed "$sedClearLastComma"

the good thing is the easy part of deleting newline "\n" characters!

EDIT: another great way to join lines into a single line with sed is this: |sed ':a;N;$!ba;s/\n/ /g' got from here.



回答9:

Don't seen this simple solution with awk

awk 'b{b=b","}{b=b$2}END{print b}' infile


回答10:

You can use grep:

grep -o "+\S\+" in.txt | tr '\n' ','

which finds the string starting with +, followed by any string \S\+, then convert new line characters into commas. This should be pretty quick for large files.



回答11:

With perl:

fg@erwin ~ $ perl -ne 'push @l, (split(/\s+/))[1]; END { print join(",", @l) . "\n" }' <<EOF
something1:    +12.0   (some unnecessary trailing data (this must go))
something2:    +15.5   (some more unnecessary trailing data)
something4:    +9.0   (some other unnecessary data)
something1:    +13.5  (blah blah blah)
EOF

+12.0,+15.5,+9.0,+13.5


回答12:

You can also do it with two sed calls:

$ cat file.txt 
something1:    +12.0   (some unnecessary trailing data (this must go))
something2:    +15.5   (some more unnecessary trailing data)
something4:    +9.0   (some other unnecessary data)
something1:    +13.5  (blah blah blah)
$ sed 's/^[^:]*: *\([+0-9.]\+\) .*/\1/' file.txt | sed -e :a -e '$!N; s/\n/,/; ta'
+12.0,+15.5,+9.0,+13.5

First sed call removes uninteresting data, and the second join all lines.



回答13:

You can also print like this:

Just awk: using printf

bash-3.2$ cat sample.log
something1:    +12.0   (some unnecessary trailing data (this must go))
something2:    +15.5   (some more unnecessary trailing data)
something4:    +9.0   (some other unnecessary data)
something1:    +13.5  (blah blah blah)

bash-3.2$ awk ' { if($2 != "") { if(NR==1) { printf $2 } else { printf "," $2 } } }' sample.log
+12.0,+15.5,+9.0,+13.5


回答14:

Another Perl solution, similar to Dan Fego's awk:

perl -ane 'print "$F[1],"' file.txt | sed 's/,$/\n/'

-a tells perl to split the input line into the @F array, which is indexed starting at 0.



回答15:

A solution written in pure Bash:

#!/bin/bash

sometext="something1:    +12.0   (some unnecessary trailing data (this must go))
something2:    +15.5   (some more unnecessary trailing data)
something4:    +9.0   (some other unnecessary data)
something1:    +13.5  (blah blah blah)"

a=()
while read -r a1 a2 a3; do
    # we can add some code here to check valid values or modify them
    a+=("${a2}")
done <<< "${sometext}"
# between parenthesis to modify IFS for the current statement only
(IFS=',' ; printf '%s: %s\n' "Result" "${a[*]}")

Result: +12.0,+15.5,+9.0,+13.5



回答16:

Well the hardest part probably is selecting the second "column" since I wouldn't know of an easy way to treat multiple spaces as one. For the rest it's easy. Use bash substitutions.

# cat bla.txt
something1:    +12.0   (some unnecessary trailing data (this must go))
something2:    +15.5   (some more unnecessary trailing data)
something4:    +9.0   (some other unnecessary data)
something1:    +13.5  (blah blah blah)

# cat bla.sh
OLDIFS=$IFS
IFS=$'\n'
for i in $(cat bla.txt); do
  i=$(echo "$i" | awk '{print $2}')
  u="${u:+$u, }$i"
done
IFS=$OLDIFS
echo "$u"

# bash ./bla.sh
+12.0, +15.5, +9.0, +13.5