An efficient way to transpose a file in Bash-第2页回答

I have a huge tab-separated file formatted like this

X column1 column2 column3
row1 0 1 2
row2 3 4 5
row3 6 7 8
row4 9 10 11

I would like to transpose it in an efficient way using only bash commands (I could write a ten or so lines Perl script to do that, but it should be slower to execute than the native bash functions). So the output should look like

X row1 row2 row3 row4
column1 0 3 6 9
column2 1 4 7 10
column3 2 5 8 11

I thought of a solution like this

cols=`head -n 1 input | wc -w`
for (( i=1; i <= $cols; i++))
do cut -f $i input | tr $'\n' $'\t' | sed -e "s/\t$/\n/g" >> output
done

But it's slow and doesn't seem the most efficient solution. I've seen a solution for vi in this post, but it's still over-slow. Any thoughts/suggestions/brilliant ideas? :-)

标签： bash parsing unix transpose

25条回答

浪荡孟婆

2楼-- · 2018-12-31 08:31

I was looking for a solution to transpose any kind of matrix (nxn or mxn) with any kind of data (numbers or data) and got the following solution:

Row2Trans=number1
Col2Trans=number2

for ((i=1; $i <= Line2Trans; i++));do
    for ((j=1; $j <=Col2Trans ; j++));do
        awk -v var1="$i" -v var2="$j" 'BEGIN { FS = "," }  ; NR==var1 {print $((var2)) }' $ARCHIVO >> Column_$i
    done
done

paste -d',' `ls -mv Column_* | sed 's/,//g'` >> $ARCHIVO

0人赞添加讨论(0) 举报

高级女魔头

3楼-- · 2018-12-31 08:34

If you have sc installed, you can do:

psc -r < inputfile | sc -W% - > outputfile

0人赞添加讨论(0) 举报

柔情千种

4楼-- · 2018-12-31 08:38

I normally use this little awk snippet for this requirement:

  awk '{for (i=1; i<=NF; i++) a[i,NR]=$i
        max=(max<NF?NF:max)}
        END {for (i=1; i<=max; i++)
              {for (j=1; j<=NR; j++) 
                  printf "%s%s", a[i,j], (j==NR?RS:FS)
              }
        }' file

This just loads all the data into a bidimensional array a[line,column] and then prints it back as a[column,line], so that it transposes the given input.

This needs to keep track of the maximum amount of columns the initial file has, so that it is used as the number of rows to print back.

0人赞添加讨论(0) 举报

君临天下

5楼-- · 2018-12-31 08:40

the transpose project on sourceforge is a coreutil-like C program for exactly that.

gcc transpose.c -o transpose
./transpose -t input > output #works with stdin, too.

0人赞添加讨论(0) 举报

刘海飞了

6楼-- · 2018-12-31 08:40

A hackish perl solution can be like this. It's nice because it doesn't load all the file in memory, prints intermediate temp files, and then uses the all-wonderful paste

#!/usr/bin/perl
use warnings;
use strict;

my $counter;
open INPUT, "<$ARGV[0]" or die ("Unable to open input file!");
while (my $line = <INPUT>) {
    chomp $line;
    my @array = split ("\t",$line);
    open OUTPUT, ">temp$." or die ("unable to open output file!");
    print OUTPUT join ("\n",@array);
    close OUTPUT;
    $counter=$.;
}
close INPUT;

# paste files together
my $execute = "paste ";
foreach (1..$counter) {
    $execute.="temp$counter ";
}
$execute.="> $ARGV[1]";
system $execute;

0人赞添加讨论(0) 举报

笑指拈花

7楼-- · 2018-12-31 08:40

I used fgm's solution (thanks fgm!), but needed to eliminate the tab characters at the end of each row, so modified the script thus:

#!/bin/bash 
declare -a array=( )                      # we build a 1-D-array

read -a line < "$1"                       # read the headline

COLS=${#line[@]}                          # save number of columns

index=0
while read -a line; do
    for (( COUNTER=0; COUNTER<${#line[@]}; COUNTER++ )); do
        array[$index]=${line[$COUNTER]}
        ((index++))
    done
done < "$1"

for (( ROW = 0; ROW < COLS; ROW++ )); do
  for (( COUNTER = ROW; COUNTER < ${#array[@]}; COUNTER += COLS )); do
    printf "%s" ${array[$COUNTER]}
    if [ $COUNTER -lt $(( ${#array[@]} - $COLS )) ]
    then
        printf "\t"
    fi
  done
  printf "\n" 
done

0人赞添加讨论(0) 举报

An efficient way to transpose a file in Bash

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间