Sort subgroups of lines with command-line tools

Posted 2019-09-01 03:07

Question:

I've been trying to find a way of sorting this with standard command-line tools (bash, awk, sort, whatever), but I can't find a way apart from using Perl or similar.

Any hint?

Input data

header1
3
2
5
1

header2
5
1
3
.....
.....

Output data

header1
1
2
3
5

header2
1
....

Thanks

Answer 1:

This assumes sections are separated by single blank lines. The header is simply the first line of each section and need not contain the string "header". Sections stay in their original order; only the lines under each header are sorted. It reads from stdin and writes to stdout.

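#!/bin/bash

# Copy lines from stdin up to (not including) the next blank line or EOF.
function read_section() {
    while IFS= read -r LINE && [ -n "$LINE" ]; do printf '%s\n' "$LINE"; done
}

# Print the first line (the header) as is, then the sorted rest and a blank separator.
function sort_section() {
    IFS= read -r HEADER && (printf '%s\n' "$HEADER"; sort; echo)
}

# One section per iteration; stops when sort_section finds no header (EOF).
while read_section | sort_section; do :; done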

Or as a one-liner:

while (while read LINE && [ "$LINE" ]; do echo "$LINE"; done) | (read HEADER && (echo "$HEADER"; sort; echo)); do :; done < test.txt


Answer 2:

Try this:

mark@ubuntu:~$ cat /tmp/test.txt
header1
3
2
5
1

header2
5
1
3
mark@ubuntu:~$ cat /tmp/test.txt | awk '/header/ {colname=$1; next} {print colname, "," , $0}'  | sort | awk '{if ($1 != header) {header = $1; print header} print $3}'
header1

1
2
3
5
header2
1
3
5

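To get rid of the blank line, I guess you can append "| grep -v '^$'" to the pipeline. Note that this relies on every header line matching /header/, and that sort compares the prefixed lines as text, not as numbers.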
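
As an alternative to the trailing grep, the blank lines can be dropped before they are ever prefixed. The following is only a sketch of that idea: the function name sort_by_header is made up for illustration, and it keeps the same assumptions as the pipeline above (header lines match /header/, values contain no spaces, and the comparison is textual).

```shell
# Hypothetical helper: same pipeline as above, but blank lines are
# skipped in the first awk stage (the NF pattern is false for them).
sort_by_header() {
    awk '/header/ { colname = $1; next }      # remember the current header
         NF       { print colname, ",", $0 }  # prefix non-blank lines only
    ' |
    sort |
    awk '{ if ($1 != header) { header = $1; print header } print $3 }'
}
```

It would be used as "sort_by_header < /tmp/test.txt".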



Answer 3:

1. Use awk to prefix each number line with its header.
2. Sort the resulting file.
3. Remove the prefix to return the file to its original format.
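
The steps above can be sketched roughly as follows. This is one possible reading of the idea, not necessarily what the answer had in mind: instead of the header text, each line is prefixed with a section number and a flag (0 for the header, 1 for data), so the header always sorts first within its section. The function name sort_sections is hypothetical, and the sketch assumes blank-line-separated sections, numeric values, and no tab characters in the data.

```shell
# Hypothetical helper: decorate, sort, undecorate.
sort_sections() {
    tab=$(printf '\t')
    # 1. Decorate: "<section number> TAB <0=header|1=data> TAB <line>"
    awk -v OFS='\t' '
        BEGIN    { at_start = 1 }
        /^$/     { at_start = 1; next }       # a blank line ends a section
        at_start { sec++; at_start = 0; print sec, 0, $0; next }
                 { print sec, 1, $0 }
    ' |
    # 2. Sort by section, then header before data, then numeric value.
    sort -t "$tab" -k1,1n -k2,2n -k3,3n |
    # 3. Undecorate, putting a blank line back between sections.
    awk -F'\t' '$2 == 0 && NR > 1 { print "" } { print $3 }'
}
```

It would be used as "sort_sections < test.txt". Because the section number is part of the sort key, the sections keep their original order and the headers do not need to contain any particular string.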



Answer 4:

With GNU awk, you can use its built-in sort functions.

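awk 'BEGIN { RS = ""; FS = "\n" }   # paragraph mode: one record per section, one field per line
{
    print $1                        # the first line of the section is the header
    for (i = 2; i <= NF; i++) {
        a[i] = $i
    }
    n = asort(a, d)                 # gawk only: sorted values end up in d[1..n]
    for (i = 1; i <= n; i++) {
        print d[i]
    }
    delete d
    delete a
}' file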

Output:

# more file
header1
3
2
5
1

header2
5
1
3
# ./test.sh
header1
1
2
3
5
header2
1
3
5