I've been trying to find a way of sorting this with standard command-line tools (bash, awk, sort, whatever), but I can't find one other than using Perl or similar.
Any hints?
Input data
header1
3
2
5
1
header2
5
1
3
.....
.....
Output data
header1
1
2
3
5
header2
1
....
Thanks
This assumes sections are separated by blank lines; the header doesn't necessarily contain the string "header". Sections stay in their original order, and only the lines within each section are sorted. It reads from stdin and writes to stdout.
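#!/bin/bash
# Copy lines from stdin up to (but not including) the next blank line.
function read_section() {
    while read -r LINE && [ -n "$LINE" ]; do printf '%s\n' "$LINE"; done
}
# Print the first line (the header) unchanged, then the rest sorted, then a blank separator.
# Plain sort compares text; use sort -n if the numbers can have more than one digit.
function sort_section() {
    read -r HEADER && (printf '%s\n' "$HEADER"; sort; echo)
}
# Handle one section per iteration; stop when no header can be read (end of input).
while read_section | sort_section; do :; done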
Or as a one-liner:
cat test.txt | while (while read LINE && [ "$LINE" ]; do echo "$LINE"; done) | (read HEADER && (echo "$HEADER"; sort; echo)); do :; done
Try this:
mark@ubuntu:~$ cat /tmp/test.txt
header1
3
2
5
1
header2
5
1
3
mark@ubuntu:~$ cat /tmp/test.txt | awk '/header/ {colname=$1; next} {print colname, "," , $0}' | sort | awk '{if ($1 != header) {header = $1; print header} print $3}'
header1
1
2
3
5
header2
1
3
5
To get rid of the blank lines, you can add a "| grep -v '^$'" at the end of the pipeline.
Use AWK to prefix the header to each number line.
sort the resulting file.
Remove the prefix to return the file to its original format.
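Here is a minimal sketch of those three steps. It assumes header lines start with "header", the data lines are numbers, and no line contains a tab character. The function name sort_sections is made up for illustration.

```shell
#!/bin/sh
# Sketch of decorate / sort / undecorate.
# Assumptions: headers start with "header", data lines are numbers,
# and no input line contains a tab. sort_sections is a hypothetical name.
sort_sections() {
    tab=$(printf '\t')
    # 1. Decorate: emit "header<TAB>value" for every non-blank data line.
    awk -v OFS='\t' '/^header/ { h = $0; next } NF { print h, $0 }' |
    # 2. Sort by header text, then numerically by value.
    sort -t "$tab" -k1,1 -k2,2n |
    # 3. Undecorate: print each header once, followed by its values.
    awk -F '\t' '$1 != prev { print $1; prev = $1 } { print $2 }'
}

printf 'header1\n3\n2\n5\n1\n\nheader2\n5\n1\n3\n' | sort_sections
```

For the sample input this prints header1, 1, 2, 3, 5, header2, 1, 3, 5, one per line. Note that the sections come out ordered by header text rather than in input order, since the header is the primary sort key. The -k2,2n key makes sort compare the values numerically, so 10 sorts after 9.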
With GNU awk, you can use its built-in sort functions.
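awk 'BEGIN { RS = "" }    # paragraph mode: records are separated by blank lines
{
    print $1                       # the first field is the header
    for (i = 2; i <= NF; i++) {    # collect the remaining fields
        a[i] = $i
    }
    b = asort(a, d)                # gawk-only: sorted values go into d[1..b]
    for (i = 1; i <= b; i++) {
        print d[i]
    }
    delete d
    delete a
}' file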
output
# more file
header1
3
2
5
1
header2
5
1
3
# ./test.sh
header1
1
2
3
5
header2
1
3
5