I have a huge tab-separated file formatted like this:

    X column1 column2 column3
    row1 0 1 2
    row2 3 4 5
    row3 6 7 8
    row4 9 10 11
I would like to transpose it in an efficient way using only bash commands (I could write a Perl script of ten lines or so to do that, but it would probably be slower to execute than native bash functions). So the output should look like:
    X row1 row2 row3 row4
    column1 0 3 6 9
    column2 1 4 7 10
    column3 2 5 8 11
I thought of a solution like this:

    cols=$(head -n 1 input | wc -w)      # number of columns
    for (( i = 1; i <= cols; i++ )); do
        # print column i as one row: join its lines with tabs,
        # then turn the trailing tab into a newline
        cut -f "$i" input | tr $'\n' $'\t' | sed -e 's/\t$/\n/' >> output
    done
But it's slow and doesn't seem like the most efficient solution. I've seen a solution for vi in this post, but it's still too slow. Any thoughts/suggestions/brilliant ideas? :-)
Another version, built with set and eval:
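A minimal sketch of that idea, assuming whitespace-separated fields containing no backslashes or glob characters: set splits each row into positional parameters, and eval appends each field to a per-column accumulator.

    #!/bin/bash
    ncols=$(head -n 1 input | wc -w)
    while read -r line; do
        set -- $line                                  # split the row into $1..$N
        for (( i = 1; i <= ncols; i++ )); do
            # append field i of this row to the accumulator for output row i
            eval "row$i=\"\${row$i:+\${row$i}\t}\${$i}\""
        done
    done < input
    for (( i = 1; i <= ncols; i++ )); do
        eval "printf '%b\n' \"\$row$i\""              # %b turns the \t separators into tabs
    done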
Not very elegant, but this "single-line" command solves the problem quickly:
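Presumably something along these lines (a reconstruction; the hard-coded cols=4 matches the note below):

    cols=4; for (( i = 1; i <= cols; i++ )); do cut -f "$i" input | tr $'\n' $'\t' | sed -e 's/\t$/\n/'; done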
Here cols is the number of columns; you can replace the hard-coded 4 with head -n 1 input | wc -w.
An awk solution that stores the whole input in an array in memory:
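A sketch of that approach, assuming tab-separated input in a file named infile (a placeholder name):

    awk -F'\t' -v OFS='\t' '
    {
        for (j = 1; j <= NF; j++) {
            out[NR,j] = $j                 # remember field j of row NR
            if (maxc < j) maxc = j         # track the widest row
        }
    }
    END {
        for (j = 1; j <= maxc; j++)        # one output row per input column
            for (i = 1; i <= NR; i++)
                printf "%s%s", out[i,j], (i < NR ? OFS : ORS)
    }' infile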
But we may instead "walk" the file as many times as there are output rows:
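A sketch of that multi-pass variant, again assuming tab-separated infile: it re-reads the file once per output row, so only one column is ever held at a time.

    ncols=$(head -n 1 infile | awk -F'\t' '{ print NF }')
    for (( i = 1; i <= ncols; i++ )); do
        # one pass per output row: print field i of every line, tab-separated
        awk -F'\t' -v col="$i" '
            NR > 1 { printf "\t" }
            { printf "%s", $col }
            END { print "" }
        ' infile
    done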
This, for a low count of output rows, is faster than the previous code.
Here is a Bash one-liner that is based on simply converting each line to a column and paste-ing them together.
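A sketch of that loop, assuming space-separated input in m.txt (each output line gains a leading tab from the first paste, which cut strips at the end):

    echo '' > tmp1
    while read -r l; do
        paste tmp1 <(echo "$l" | tr -s ' ' '\n') > tmp2   # row -> column, glued onto tmp1
        cp tmp2 tmp1
    done < m.txt
    cut -f2- tmp1                                         # drop the leading empty column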
Step by step, it:

1. creates the tmp1 file so it's not empty,
2. reads each line and transforms it into a column using tr,
3. pastes the new column onto the tmp1 file,
4. copies the result back into tmp1.

PS: I really wanted to use io-descriptors but couldn't get them to work.
I was just looking for a similar bash transpose, but with support for padding. Here is the script I wrote based on fgm's solution; it seems to work. If it can be of help...
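A sketch of a padded pure-bash transpose in that spirit (a reconstruction, not the original script): it loads the whole file into a flat array, tracks the widest field, and pads with printf.

    #!/bin/bash
    # Usage: ./transpose.sh file   (whitespace-separated; the first line fixes the column count)
    declare -a array=()

    read -r -a line < "$1"
    cols=${#line[@]}

    index=0 width=0
    while read -r -a line; do
        for (( i = 0; i < cols; i++ )); do
            array[index]=${line[i]:-}                 # missing fields become empty (padding)
            (( ${#array[index]} > width )) && width=${#array[index]}
            (( index++ ))
        done
    done < "$1"

    rows=$(( index / cols ))
    for (( col = 0; col < cols; col++ )); do          # each input column becomes an output row
        for (( row = 0; row < rows; row++ )); do
            printf "%-${width}s " "${array[row*cols+col]}"
        done
        printf '\n'
    done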
There is a purpose-built utility for this, the GNU datamash utility.
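With the question's tab-separated input it is a one-liner; the transpose operation is documented at the sites below:

    datamash transpose < input > output

For whitespace-separated fields, add -W so runs of whitespace are treated as the delimiter.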
Taken from these sites: https://www.gnu.org/software/datamash/ and http://www.thelinuxrain.com/articles/transposing-rows-and-columns-3-methods