I have a huge tab-separated file formatted like this
X column1 column2 column3
row1 0 1 2
row2 3 4 5
row3 6 7 8
row4 9 10 11
I would like to transpose it efficiently using only bash commands (I could write a Perl script of ten or so lines to do that, but I expect it would be slower to execute than native bash functions). The output should look like this:
X row1 row2 row3 row4
column1 0 3 6 9
column2 1 4 7 10
column3 2 5 8 11
I thought of a solution like this
cols=`head -n 1 input | wc -w`
for (( i=1; i <= $cols; i++))
do cut -f $i input | tr $'\n' $'\t' | sed -e "s/\t$/\n/g" >> output
done
But it's slow and doesn't seem like the most efficient solution. I've seen a solution for vi in this post, but it's still too slow. Any thoughts/suggestions/brilliant ideas? :-)
I was looking for a solution to transpose any kind of matrix (nxn or mxn) with any kind of data (numbers or data) and got the following solution:
If you have sc installed, you can do:
I normally use this little awk snippet for this requirement:
This just loads all the data into a two-dimensional array a[line,column] and then prints it back as a[column,line], so that it transposes the given input.
This needs to keep track of the maximum number of columns the initial file has, so that it is used as the number of rows to print back.
The transpose project on SourceForge is a coreutil-like C program for exactly that.
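The awk approach described above (load everything into a[line,column], then print it back as a[column,line]) can be sketched roughly as follows. This is a minimal illustration, not necessarily the snippet from the answer: the function name transpose is made up here, and it assumes tab-separated input that fits in memory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: transpose a tab-separated table read from stdin.
# The whole table is kept in an awk array, so it must fit in memory.
transpose() {
  awk -F'\t' '
    {
      # Store every field under the key (line, column).
      for (col = 1; col <= NF; col++) a[NR, col] = $col
      # Track the widest line; it becomes the number of output rows.
      if (NF > max) max = NF
    }
    END {
      # Print the array back with the indices swapped.
      for (col = 1; col <= max; col++)
        for (row = 1; row <= NR; row++)
          printf "%s%s", a[row, col], (row < NR ? "\t" : "\n")
    }
  '
}
```

It would be called as transpose < input > output. Rows shorter than the widest one simply produce empty fields in the result.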
A hackish Perl solution can be like this. It's nice because it doesn't load the whole file into memory, prints intermediate temp files, and then uses the all-wonderful paste.
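The same idea (one temp file per input row, glued together with paste) can also be sketched in bash. This is not the Perl script from the answer, just an illustration of the technique; the function name transpose_with_paste is made up, and since every row becomes its own file, very tall inputs may run into limits on argument length or open files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: transpose a tab-separated file using temp files and paste.
# Only one input line is held in memory at a time.
transpose_with_paste() {
  local input=$1 tmpdir n=0 line
  tmpdir=$(mktemp -d) || return 1
  # Write each input row as a one-column file: field k lands on line k.
  while IFS= read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    printf '%s\n' "${line//$'\t'/$'\n'}" > "$(printf '%s/%09d' "$tmpdir" "$n")"
  done < "$input"
  # Zero-padded names glob in row order, so paste joins the columns in sequence.
  if [ "$n" -gt 0 ]; then
    paste "$tmpdir"/*
  fi
  rm -rf "$tmpdir"
}
```

It would be called as transpose_with_paste input > output.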
I used fgm's solution (thanks fgm!), but needed to eliminate the tab characters at the end of each row, so modified the script thus: