Is there a way to ignore header lines in a UNIX sort?

Posted 2020-01-27 00:25

I have a fixed-width-field file which I'm trying to sort using the UNIX (Cygwin, in my case) sort utility.

The problem is there is a two-line header at the top of the file which is being sorted to the bottom of the file (as each header line begins with a colon).

Is there a way to tell sort either "pass the first two lines across unsorted" or to specify an ordering which sorts the colon lines to the top? The remaining lines always start with a 6-digit number (which is actually the key I'm sorting on), if that helps.

Example:

:0:12345
:1:6:2:3:8:4:2
010005TSTDOG_FOOD01
500123TSTMY_RADAR00
222334NOTALINEOUT01
477821USASHUTTLES21
325611LVEANOTHERS00

should sort to:

:0:12345
:1:6:2:3:8:4:2
010005TSTDOG_FOOD01
222334NOTALINEOUT01
325611LVEANOTHERS00
477821USASHUTTLES21
500123TSTMY_RADAR00

12 Answers
冷血范
#2 · 2020-01-27 00:58

This is the same as Ian Sherbin's answer, but my implementation is:

cut -d'|' -f3,4,7 $arg1 | uniq > filetmp.tc
head -1 filetmp.tc > file.tc;
tail -n+2 filetmp.tc | sort -t"|" -k2,2 >> file.tc;
对你真心纯属浪费
#3 · 2020-01-27 01:01
head -2 <your_file> && nawk 'NR>2' <your_file> | sort

example:

> cat temp
10
8
1
2
3
4
5
> head -2 temp && nawk 'NR>2' temp | sort -r
10
8
5
4
3
2
1
叼着烟拽天下
#4 · 2020-01-27 01:04

Here is a version that works on piped data:

(read -r; printf "%s\n" "$REPLY"; sort)

If your header has multiple lines:

(for i in $(seq $HEADER_ROWS); do read -r; printf "%s\n" "$REPLY"; done; sort)

This solution is from here
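
The loop above can be wrapped in a small reusable function. The following is only a sketch: the name `sort_body` and its argument order are made up for illustration, and it assumes a shell such as bash or dash, where `read` consumes input from a pipe only up to the end of the current line, so the remaining lines are still available to `sort`. It uses `IFS= read -r line` instead of `$REPLY` so that it also preserves leading whitespace in shells other than bash.

```shell
#!/bin/sh
# sort_body N [sort options...]
# Hypothetical helper: copies the first N lines of stdin through unchanged,
# then sorts everything that follows with the given sort options.
sort_body() {
    header_rows=$1
    shift
    i=0
    while [ "$i" -lt "$header_rows" ]; do
        # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
        IFS= read -r line || break
        printf '%s\n' "$line"
        i=$((i + 1))
    done
    # Whatever is left on stdin is the body; hand it to sort.
    sort "$@"
}

# Usage with a few lines from the question's example:
printf '%s\n' ':0:12345' ':1:6:2:3:8:4:2' \
    '500123TSTMY_RADAR00' '010005TSTDOG_FOOD01' | sort_body 2
```

Any extra arguments are passed straight to `sort`, so `sort_body 1 -n` would keep one header line and sort the rest numerically.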

smile是对你的礼貌
#5 · 2020-01-27 01:07
(head -n 2 <file> && tail -n +3 <file> | sort) > newfile

The parentheses create a subshell, wrapping up the stdout so you can pipe it or redirect it as if it had come from a single command.
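
As a rough sketch of how this pattern generalises, the function below takes the file name and the number of header lines as parameters. The name `sort_file_keep_header` and the temporary file used in the demo are assumptions for illustration only.

```shell
#!/bin/sh
# sort_file_keep_header FILE N [sort options...]
# Hypothetical helper: prints the first N lines of FILE as they are,
# followed by the remaining lines sorted.
sort_file_keep_header() {
    file=$1
    header_rows=$2
    shift 2
    # The parentheses group both commands so their output forms one stream.
    (
        head -n "$header_rows" "$file" &&
        tail -n "+$((header_rows + 1))" "$file" | sort "$@"
    )
}

# Demo on a throwaway file containing part of the question's example.
tmp=$(mktemp)
printf '%s\n' ':0:12345' ':1:6:2:3:8:4:2' \
    '477821USASHUTTLES21' '222334NOTALINEOUT01' > "$tmp"
sort_file_keep_header "$tmp" 2
rm -f "$tmp"
```

When redirecting the result, write to a different file than the input, as in the `> newfile` example: the shell truncates the redirect target before `head` and `tail` get to read it.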

等我变得足够好
#6 · 2020-01-27 01:11

It only takes 2 lines of code...

head -1 test.txt > a.tmp; 
tail -n+2 test.txt | sort -n >> a.tmp;

For numeric data, -n is required. For an alphabetical sort, omit it.

Example file:
$ cat test.txt

header
8
5
100
1
-1

Result:
$ cat a.tmp

header
-1
1
5
8
100
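
The effect of -n can be checked on the same values. This is a small illustration, not part of the original answer; `LC_ALL=C` is set here only so that the plain sort compares raw bytes and gives the same result on every system.

```shell
#!/bin/sh
# Same values as test.txt above, minus the header line.
data=$(printf '%s\n' 8 5 100 1 -1)

# Without -n, lines are compared as text, character by character.
lexical=$(printf '%s\n' "$data" | LC_ALL=C sort)
# With -n, lines are compared by their numeric value.
numeric=$(printf '%s\n' "$data" | LC_ALL=C sort -n)

# Unquoted on purpose so each result prints on a single line.
echo "without -n:" $lexical   # -1 1 100 5 8
echo "with -n:   " $numeric   # -1 1 5 8 100
```

Without -n, "100" lands before "5" because the text comparison stops at the first differing character.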

家丑人穷心不美
#7 · 2020-01-27 01:11

With Python:

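import sys

HEADER_ROWS = 2  # number of leading lines to pass through unsorted

# Copy the header lines from stdin to stdout untouched.
for _ in range(HEADER_ROWS):
    sys.stdout.write(next(sys.stdin))
# Sort the remaining lines as plain strings and write them out.
for row in sorted(sys.stdin):
    sys.stdout.write(row)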