I have the following sed command:
sed 's/\s/,/g' input > output.csv
(I got the command from this related topic)
which turns the following input:
SNP A1 A2 FRQ INFO OR SE P
10:33367054 C T 0.9275 0.9434 1.1685 0.1281 0.1843
10:33367707 G A 0.9476 0.9436 1.0292 0.1530 0.8244
10:33367804 G C 0.4193 1.0443 0.9734 0.0988 0.6443
10:33368119 C A 0.9742 0.9343 1.0201 0.1822 0.9156
into:
SNP,,A1,,A2,,,,,FRQ,,,,INFO,,,,,,OR,,,,,,SE,,,,,,,P
10:33367054,,,C,,,T,,0.9275,,0.9434,,1.1685,,0.1281,,0.1843
10:33367707,,,G,,,A,,0.9476,,0.9436,,1.0292,,0.1530,,0.8244
10:33367804,,,G,,,C,,0.4193,,1.0443,,0.9734,,0.0988,,0.6443
10:33368119,,,C,,,A,,0.9742,,0.9343,,1.0201,,0.1822,,0.9156
I need a command that turns multiple consecutive spaces into just one comma, to give me an output like this:
SNP,A1,A2,FRQ,INFO,OR,SE,P
10:33367054,C,T,0.9275,0.9434,1.1685,0.1281,0.1843
10:33367707,G,A,0.9476,0.9436,1.0292,0.1530,0.8244
10:33367804,G,C,0.4193,1.0443,0.9734,0.0988,0.6443
10:33368119,C,A,0.9742,0.9343,1.0201,0.1822,0.9156
Any ideas?
If you enable extended regular expressions with -r, then you can just add + to \s, which means one or more.
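A minimal sketch of that substitution, assuming GNU sed for the first form. The sample rows and their spacing are invented for illustration, and the second form (with -E and the POSIX class [[:space:]] in place of \s) is an added variant for seds without the GNU extensions, not something taken from the answer.

```shell
# Two sample rows with uneven runs of spaces (spacing invented for illustration).
input='SNP  A1   A2     FRQ
10:33367054   C   T  0.9275'

# GNU sed: -r enables extended regular expressions, so \s+ means
# "one or more whitespace characters" and each run becomes one comma.
gnu=$(printf '%s\n' "$input" | sed -r 's/\s+/,/g')

# Same substitution with -E and a POSIX character class instead of \s.
portable=$(printf '%s\n' "$input" | sed -E 's/[[:space:]]+/,/g')

printf '%s\n' "$gnu"
```

Applied to the files from the question, the first form would read sed -r 's/\s+/,/g' input > output.csv.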
Note: On Mac OS X, sed is based on BSD and does not have the GNU extensions, so you will have to use the -E flag instead of -r.

If you want to use sed, you can match one or more spaces and replace them with a single comma; this is based on glenn jackman's answer to How to strip multipe spaces to one using sed?.
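One plausible form of such a command, sketched under the assumption that the separators are plain spaces; it is not necessarily the exact command the answer showed. The interval expression \{1,\} means "one or more" in basic regular expressions, so no extra flag is needed. The sample row is invented for illustration.

```shell
# One sample row with uneven runs of spaces (spacing invented for illustration).
input='10:33367707   G   A  0.9476'

# " \{1,\}" matches a space repeated one or more times (basic regex interval),
# and the g flag replaces every such run on the line with one comma.
result=$(printf '%s\n' "$input" | sed 's/ \{1,\}/,/g')

printf '%s\n' "$result"
```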
It can also be like
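For instance, one alternative spelling (an assumption, since the variant meant here is not shown) writes "one or more spaces" as a space followed by zero or more spaces, which works in basic regular expressions without any flag:

```shell
# One sample row with uneven runs of spaces (spacing invented for illustration).
input='10:33367707   G   A  0.9476'

# The pattern is two characters plus a star: a literal space, then " *"
# (zero or more further spaces), so it matches any run of at least one space.
result=$(printf '%s\n' "$input" | sed 's/  */,/g')

printf '%s\n' "$result"
```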
And note you can use sed -i.bak '...' file to get an in-place edit, so that the original file will be backed up as file.bak and file will have the edited content.

But I think it is clearer with tr. With it, you can squeeze the spaces and then replace each one of them with a comma.
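Taken by pieces, the tr approach might look like the sketch below. It assumes the separators are plain spaces; the sample row and the variable names are invented for illustration.

```shell
# One sample row with uneven runs of spaces (spacing invented for illustration).
input='10:33367804   G   C  0.4193'

# Step 1: -s squeezes each run of spaces down to a single space.
squeezed=$(printf '%s\n' "$input" | tr -s ' ')

# Step 2: translate every remaining space into a comma.
result=$(printf '%s\n' "$squeezed" | tr ' ' ',')

# Both steps at once: with -s and two sets, tr translates first and then
# squeezes repeats of the characters in the second set (here, commas).
oneshot=$(printf '%s\n' "$input" | tr -s ' ' ',')

printf '%s\n' "$result"
```

Note that the one-step form would also collapse any repeated commas already present in the input, which does not matter for data like the sample above.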
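From man tr: the -s (--squeeze-repeats) option replaces each run of a repeated character listed in the set with a single occurrence of that character.

Here is a very simple solution with awk. Assigning $1=$1 makes awk rebuild each line from its fields, so that every run of extra spaces is reduced to a single output field separator (one space by default).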
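A minimal sketch of that awk idea, assuming the output field separator is set to a comma so that the rebuilt line is comma-separated; the sample row is invented for illustration, and the exact command from the answer is not reproduced here.

```shell
# One sample row with uneven runs of spaces (spacing invented for illustration).
input='10:33368119   C   A  0.9742'

# -v OFS=',' sets the output field separator. Assigning $1 = $1 forces awk to
# rebuild the record using OFS, and the trailing 1 is an always-true pattern
# whose default action prints the rebuilt line.
result=$(printf '%s\n' "$input" | awk -v OFS=',' '{ $1 = $1 } 1')

printf '%s\n' "$result"
```

Because awk's default field splitting ignores leading and trailing whitespace, this form also avoids stray commas at the start or end of a line.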