Is there a way to ignore header lines in a UNIX so

Posted 2020-01-27 00:25

I have a fixed-width-field file which I'm trying to sort using the UNIX (Cygwin, in my case) sort utility.

The problem is there is a two-line header at the top of the file which is being sorted to the bottom of the file (as each header line begins with a colon).

Is there a way to tell sort either to "pass the first two lines across unsorted" or to use an ordering that sorts the colon lines to the top? The remaining lines always start with a 6-digit number (which is actually the key I'm sorting on), if that helps.

Example:

:0:12345
:1:6:2:3:8:4:2
010005TSTDOG_FOOD01
500123TSTMY_RADAR00
222334NOTALINEOUT01
477821USASHUTTLES21
325611LVEANOTHERS00

should sort to:

:0:12345
:1:6:2:3:8:4:2
010005TSTDOG_FOOD01
222334NOTALINEOUT01
325611LVEANOTHERS00
477821USASHUTTLES21
500123TSTMY_RADAR00

12 answers
做自己的国王
#2 · 2020-01-27 00:47

If you don't mind using awk, you can take advantage of its built-in pipe support, for example:

extract_data | awk 'NR<3{print $0;next}{print $0| "sort -r"}' 

This prints the first two lines verbatim and pipes the rest through sort -r (reverse order; drop the -r for the ascending order shown in the question).

Note that this has the specific advantage of being able to selectively sort parts of piped input. Methods that read the input more than once (such as pairing head with tail) only work on regular files. This works on any stream.
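
To make this concrete, here is a minimal sketch that wraps the same awk idea in a helper and runs it on the sample data from the question. The function name keep_header_sort and its header-count argument are illustrative assumptions, not part of the original answer, and the fflush() call is an added precaution so that the header reaches the output before sort's results on awk implementations that buffer.

```shell
# keep_header_sort N: hypothetical helper, not from the original answer.
# Reads stdin, prints the first N lines unchanged, sorts the remainder.
keep_header_sort() {
    awk -v n="$1" '
        NR <= n { print; fflush(); next }  # header: emit and flush right away
        { print | "sort" }                 # data: stream into one sort process
    '
}

# Sample data from the question (two header lines, then records).
printf '%s\n' \
    ':0:12345' \
    ':1:6:2:3:8:4:2' \
    '010005TSTDOG_FOOD01' \
    '500123TSTMY_RADAR00' \
    '222334NOTALINEOUT01' \
    '477821USASHUTTLES21' \
    '325611LVEANOTHERS00' |
    keep_header_sort 2
```

Because the data arrives on a pipe and is read only once, the same helper should work behind any command that produces the file.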

对你真心纯属浪费
#3 · 2020-01-27 00:47

You can use tail -n +3 <file> | sort ... (tail will output the file contents from the 3rd line).
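
Note that tail on its own drops the header from the output. For a regular file (not a pipe), a minimal sketch that keeps the two header lines in place could pair it with head. The helper name sort_file_keep2, the temporary file and the sample rows are assumptions made for illustration.

```shell
# sort_file_keep2 FILE: hypothetical helper for regular files only,
# because the file is opened twice.
sort_file_keep2() {
    head -n 2 "$1"             # the two header lines, untouched
    tail -n +3 "$1" | sort     # everything from line 3 onward, sorted
}

# Build a small sample file modelled on the question's data.
demo=$(mktemp)
printf '%s\n' ':0:12345' ':1:6:2:3:8:4:2' \
    '500123TSTMY_RADAR00' '010005TSTDOG_FOOD01' '222334NOTALINEOUT01' > "$demo"

sort_file_keep2 "$demo"
rm -f "$demo"
```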

Fickle 薄情
#4 · 2020-01-27 00:47

Here's a bash shell function derived from the other answers. It handles both files and pipes. The first argument is the file name, or '-' for stdin. The remaining arguments are passed to sort. A couple of examples:

$ hsort myfile.txt
$ head -n 100 myfile.txt | hsort -
$ hsort myfile.txt -k 2,2 | head -n 20 | hsort - -r

The shell function:

hsort ()
{
   if [ "$1" == "-h" ]; then
       echo "Sort a file or standard input, treating the first line as a header.";
       echo "The first argument is the file or '-' for standard input. Additional";
       echo "arguments to sort follow the first argument, including other files.";
       echo "File syntax : $ hsort file [sort-options] [file...]";
       echo "STDIN syntax: $ hsort - [sort-options] [file...]";
       return 0;
   elif [ -f "$1" ]; then
       local file=$1;
       shift;
       (head -n 1 "$file" && tail -n +2 "$file" | sort "$@");
   elif [ "$1" == "-" ]; then
       shift;
       (read -r; printf "%s\n" "$REPLY"; sort "$@");
   else
       >&2 echo "Error. File not found: $1";
       >&2 echo "Use either 'hsort <file> [sort-options]' or 'hsort - [sort-options]'";
       return 1 ;
   fi
}
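
The function above treats only the first line as a header, while the question has two. A minimal sketch of the standard-input branch generalised to N header lines follows. The name hsort_n and its calling convention are hypothetical. It relies on the shell's read builtin consuming one line at a time, so the rest of the stream is left for sort.

```shell
# hsort_n N [sort-options]: hypothetical variant, reads standard input only.
hsort_n() {
    n=$1
    shift
    # Copy the first N lines through unchanged, one read per line.
    while [ "$n" -gt 0 ] && IFS= read -r line; do
        printf '%s\n' "$line"
        n=$((n - 1))
    done
    # Whatever is left on stdin goes to sort with the remaining options.
    sort "$@"
}

printf '%s\n' ':0:12345' ':1:6:2:3:8:4:2' \
    '500123TSTMY_RADAR00' '010005TSTDOG_FOOD01' | hsort_n 2
```

Extra arguments are forwarded as in the original, so something like hsort_n 2 -r would sort the body in reverse.
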
等我变得足够好
#5 · 2020-01-27 00:52

Here's a bash function that takes exactly the same arguments as sort and supports both files and pipes.

function skip_header_sort() {
    if [[ $# -gt 0 ]] && [[ -f ${@: -1} ]]; then
        local file=${@: -1}
        set -- "${@:1:$(($#-1))}"
    fi
    awk -vsargs="$*" 'NR<2{print; next}{print | "sort "sargs}' $file
}

How it works: this line checks whether there is at least one argument and whether the last argument is a file.

    if [[ $# -gt 0 ]] && [[ -f ${@: -1} ]]; then

This saves the file name in a separate variable, since we're about to remove the last argument.

        local file=${@: -1}

Here we remove the last argument, since we don't want to pass it to sort as an option.

        set -- "${@:1:$(($#-1))}"

Finally, we do the awk part, passing the arguments (minus the last one if it was the file) to sort from within awk. This was originally suggested by Dave and is modified here to take sort arguments. We rely on the fact that $file is empty when the input is piped, so it is simply ignored.

    awk -vsargs="$*" 'NR<2{print; next}{print | "sort "sargs}' $file

Example usage with a comma-separated file:

$ cat /tmp/test
A,B,C
0,1,2
1,2,0
2,0,1

# SORT NUMERICALLY SECOND COLUMN
$ skip_header_sort -t, -nk2 /tmp/test
A,B,C
2,0,1
0,1,2
1,2,0

# SORT REVERSE NUMERICALLY THIRD COLUMN
$ cat /tmp/test | skip_header_sort -t, -nrk3
A,B,C
0,1,2
2,0,1
1,2,0
劳资没心,怎么记你
#6 · 2020-01-27 00:52
cat file_name.txt | sed 1d | sort 

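This removes the first line and sorts the rest. Note that the header is dropped from the output rather than kept at the top; for the two-line header in the question, use sed 1,2d instead of sed 1d.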

淡お忘
#7 · 2020-01-27 00:56

In simple cases, sed can do the job elegantly:

    your_script | (sed -u 1q; sort)

or equivalently,

    cat your_data | (sed -u 1q; sort)

The key is in the 1q -- print first line (header) and quit (leaving the rest of the input to sort).

For the example given, 2q will do the trick.

The -u switch (unbuffered) is required for those seds (notably, GNU's) that would otherwise read the input in chunks, thereby consuming data that you want to go through sort instead.
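
As a worked example, here is a minimal sketch applying this to the two-line header from the question. It assumes GNU sed (for -u). The wrapper name sed_header_sort and its arguments are hypothetical, added only so the header size and sort options can be varied.

```shell
# sed_header_sort N [sort-options]: hypothetical wrapper; assumes GNU sed.
sed_header_sort() {
    n=$1
    shift
    sed -u "${n}q"   # print the first N lines, then quit without reading further
    sort "$@"        # sort inherits stdin and sees only the remaining lines
}

# Sample data from the question (two header lines, then records).
printf '%s\n' \
    ':0:12345' \
    ':1:6:2:3:8:4:2' \
    '010005TSTDOG_FOOD01' \
    '500123TSTMY_RADAR00' \
    '222334NOTALINEOUT01' \
    '477821USASHUTTLES21' \
    '325611LVEANOTHERS00' |
    sed_header_sort 2
```

On a sed without an unbuffered mode, the header would still print, but some or all of the data lines could be swallowed before sort ever sees them.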
