While read line, awk $line with multiple delimiters

Posted 2019-02-19 04:22

Question:

I am trying a small variation of this, except that I am telling awk that the delimiter of the file, which is to be split based on the 5th field, can be either a colon ":" or a tab \t. When I run the awk -F '[:\t]' part alone, it does print the right $5 field.

However, when I try to incorporate this into the bigger command, it returns the following error:

                                                             print > f
awk: cmd. line:9:                                            ^ syntax error

This is the code:

awk -F '[:\t]' '    # read the list of numbers in Tile_Number_List
    FNR == NR {
        num[$1]
        next
    }

    # process each line of the .BAM file
    # any lines with an "unknown" $5 will be ignored
    $5 in num {
        f = "Alignments_" $5 ".sam"        print > f
    } ' Tile_Number_List.txt little.sam

Why won't it work with the -F option?

Answer 1:

The problem isn't the value of FS; it's this line, which the error message points to:

f = "Alignments_" $5 ".sam"        print > f

You have two statements on one line, so separate them with either a ; or a newline:

f = "Alignments_" $5 ".sam"; print > f

Or:

f = "Alignments_" $5 ".sam"
print > f
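
To see the two working forms in isolation, here is a minimal shell sketch. The sample line is made up for illustration (it is not taken from little.sam), and the file name is printed rather than written to, so nothing is created on disk.

```shell
# Hypothetical input line: colon-separated read name, then tab-separated fields.
line=$(printf 'HWI:1:X:1:1101:2345\tchr1\t100')

# Two statements separated by ";" parse fine.
with_semicolon=$(printf '%s\n' "$line" |
    awk -F '[:\t]' '{ f = "Alignments_" $5 ".sam"; print f }')

# The same two statements separated by a newline also parse fine.
with_newline=$(printf '%s\n' "$line" |
    awk -F '[:\t]' '{
        f = "Alignments_" $5 ".sam"
        print f
    }')

echo "$with_semicolon"
echo "$with_newline"
```

Both commands print Alignments_1101.sam, which also confirms that -F '[:\t]' splits on either delimiter.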

As a full one-liner:

awk -F '[:\t]' 'FNR==NR{n[$1];next}$5 in n{print > ("Alignments_"$5".sam")}' Tile_Number_List.txt little.sam

Or as a script file, e.g. script.awk:

BEGIN {
    FS="[:\t]" 
}
# read the list of numbers in Tile_Number_List
FNR == NR {
    num[$1]
    next
}
# process each line of the .sam file
# any lines with an "unknown" $5 will be ignored
$5 in num {
    f = "Alignments_" $5 ".sam"        
    print > f
}

To run it in this form: awk -f script.awk Tile_Number_List.txt little.sam
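
The script-file form can be tried end to end with a small sketch like the one below. The contents of Tile_Number_List.txt and little.sam here are invented sample data, assuming one tile number per line in the list and the tile number as the 5th field of each SAM line; everything runs in a temporary directory.

```shell
# Work in a throwaway directory so the output files do not clutter anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical tile list: one tile number per line.
printf '1101\n1102\n' > Tile_Number_List.txt

# Hypothetical SAM-like lines; the tile number is the 5th ":"-separated field.
printf 'HWI:1:X:1:1101:2345\tchr1\t100\n'  > little.sam
printf 'HWI:1:X:1:1102:9876\tchr2\t200\n' >> little.sam
printf 'HWI:1:X:1:9999:1111\tchr3\t300\n' >> little.sam

# The script from the answer, written out with a here-document.
cat > script.awk <<'EOF'
BEGIN {
    FS = "[:\t]"
}
FNR == NR {
    num[$1]
    next
}
$5 in num {
    f = "Alignments_" $5 ".sam"
    print > f
}
EOF

awk -f script.awk Tile_Number_List.txt little.sam

# List the files that were produced.
ls Alignments_*.sam
```

With this sample data, only Alignments_1101.sam and Alignments_1102.sam are created; the line whose 5th field is 9999 is skipped because 9999 is not in the list.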

Edit:

Many *nix tools accept the character - in place of a file name to mean input from stdin:

command | awk -f script.awk Tile_Number_List.txt -
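
A minimal sketch of the stdin form, with printf standing in for command and invented sample data; the matching 5th field is printed instead of being written to a file so the result is easy to inspect.

```shell
# Hypothetical tile list in a temporary file.
list=$(mktemp)
printf '1101\n' > "$list"

# printf stands in for "command"; "-" makes awk read the piped data from
# stdin as its second input, after the list file.
matched=$(printf 'HWI:1:X:1:1101:2345\tchr1\nHWI:1:X:1:9999:1111\tchr3\n' |
    awk -F '[:\t]' 'FNR == NR { n[$1]; next } $5 in n { print $5 }' "$list" -)

echo "$matched"
rm -f "$list"
```

Because the list file is read first, the FNR == NR block still applies only to it, and FNR restarts at 1 when awk moves on to stdin.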