Using AWK to filter a column by numerical range

Posted 2019-03-10 16:58

Question:

I'm relatively new to Bash and I'm trying to use awk to print the column 1 data of a text file based on the value in its 4th column. If the 4th column falls within the range x, the command should output the column 1 data. "x" is supposed to be a range of numbers, 1-10 (1, 2, 3, ..., 10).

awk -F: '{ if($4=="x") print $1}' filename.txt

filename.txt 
sample1 0 0 4
sample2 0 0 10
sample3 0 0 15
sample4 0 0 20

Actual use:

awk -F: '{ if($4=="1-10") print $1}' sample.txt
output = sample1, sample2, sample3, sample4

It should be: sample1 sample2 only.

Is there an error in the syntax that I'm not seeing, or am I using this syntax completely wrong?

Answer 1:

awk '{ if ($4 >= 1 && $4 <= 10) print $1 }' sample.txt
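
If the bounds should not be hard-coded, they can be passed in with -v. This is only a sketch: lo and hi are variable names I made up, and the temp file just recreates the sample data from the question.

```shell
# Recreate the sample data from the question in a throwaway temp file.
data=$(mktemp)
printf '%s\n' 'sample1 0 0 4' 'sample2 0 0 10' 'sample3 0 0 15' 'sample4 0 0 20' > "$data"

# lo and hi are hypothetical names; -v assigns them before the awk program starts.
# Both sides of each comparison look numeric, so awk compares them as numbers.
in_range=$(awk -v lo=1 -v hi=10 '$4 >= lo && $4 <= hi { print $1 }' "$data")

printf '%s\n' "$in_range"
rm -f "$data"
```

With the sample data this should print sample1 and sample2, the same as the hard-coded version.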


Answer 2:

awk '$4 ~ /^[1-9]$|^10$/{print $1}' sample.txt

output:

sample1
sample2

explanation:

  • ^[1-9]$ --> $4 must be a single digit from 1 to 9
  • | (the pipe) --> or
  • ^10$ --> $4 must be the number 10
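
One thing to keep in mind: the regex accepts only the exact strings 1 through 10, while a numeric comparison also accepts values such as 04 or 4.5. A small sketch to show the difference, using rows I made up (they are not from the question):

```shell
# Made-up rows chosen to probe the edges of both tests.
rows=$(printf '%s\n' 'a 0 0 0' 'b 0 0 4' 'c 0 0 04' 'd 0 0 4.5' 'e 0 0 10' 'f 0 0 100')

# Regex test: $4 must be exactly one of the strings 1..10.
by_regex=$(printf '%s\n' "$rows" | awk '$4 ~ /^[1-9]$|^10$/ { print $1 }')

# Numeric test: $4 is compared as a number, so 04 and 4.5 also qualify.
by_number=$(printf '%s\n' "$rows" | awk '$4 >= 1 && $4 <= 10 { print $1 }')

echo "regex matches:";   printf '%s\n' "$by_regex"
echo "numeric matches:"; printf '%s\n' "$by_number"
```

Here the regex should pick rows b and e, and the numeric test rows b, c, d and e. Which behaviour is right depends on whether the column can ever hold non-integer or zero-padded values.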


Answer 3:

awk '$4 >= 1 && $4 <= 10 {print $1}' sample.txt


Answer 4:

There may be a way to do it using only awk (never mind, see my edit below), but I don't know of one. I'd combine it with grep:

grep -E ' ([1-9]|10)$' sample.txt | awk '{print $1}'

I think you are comparing the fourth column with the literal string "1-10", not testing it against the range. Also, -F: changes the field delimiter to a colon rather than whitespace.
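
A quick way to see the delimiter effect for yourself, as a rough sketch using one line of the sample data (the shell variable names are mine):

```shell
line='sample1 0 0 4'

# With -F: the line contains no colon, so the whole line is field 1 and $4 is empty.
colon_nf=$(printf '%s\n' "$line" | awk -F: '{ print NF }')
colon_f4=$(printf '%s\n' "$line" | awk -F: '{ print "[" $4 "]" }')

# With the default separator (runs of blanks), the line splits into four fields.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')
default_f4=$(printf '%s\n' "$line" | awk '{ print "[" $4 "]" }')

echo "-F: gives NF=$colon_nf and \$4=$colon_f4"
echo "default gives NF=$default_nf and \$4=$default_f4"
```

With -F: there is only one field, so any test on $4 sees an empty value.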

Edit:

awk '$4 ~ /^([1-9]|10)$/ {print $1}' sample.txt


Answer 5:

If you want awk to look up values from a range, you can build that range in the BEGIN block.

awk 'BEGIN{for (i=1;i<=10;i++) a[i]} ($4 in a){print $1}' sample.txt 

Test:

[jaypal:~/Temp] cat sample.txt 
sample1 0 0 4
sample2 0 0 10
sample3 0 0 15
sample4 0 0 20
[jaypal:~/Temp] awk 'BEGIN{for (i=1;i<=10;i++) a[i]} ($4 in a){print $1}' sample.txt 
sample1
sample2
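
Merely referencing a[i] creates the key i, and "in" then tests whether a key exists. The same lookup should also work for values that do not form a contiguous range. A sketch, assuming an arbitrary set of 4 and 15 (the set and the names want and tmp are made up for illustration):

```shell
# Recreate the sample data from the question in a throwaway temp file.
data=$(mktemp)
printf '%s\n' 'sample1 0 0 4' 'sample2 0 0 10' 'sample3 0 0 15' 'sample4 0 0 20' > "$data"

picked=$(awk '
BEGIN {
    # split() fills tmp[1..n] with the listed values; "4 15" is an arbitrary example set.
    n = split("4 15", tmp, " ")
    # Referencing want[...] is enough to create the key; no value is assigned.
    for (i = 1; i <= n; i++) want[tmp[i]]
}
($4 in want) { print $1 }
' "$data")

printf '%s\n' "$picked"
rm -f "$data"
```

With the sample data this should print sample1 and sample3.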


Answer 6:

If Perl is an option, you can try this solution similar to Kambus's awk solution:

perl -lane 'print $F[0] if $F[3] >= 1 && $F[3] <= 10' sample.txt

These command-line options are used:

  • -n loops over every line of the input file without automatically printing each line

  • -l strips the trailing newline from each line before processing and adds it back when printing

  • -a autosplit mode: splits each input line into the @F array

  • -e executes the Perl code given on the command line

@F is the array of whitespace-separated words in each line, indexed starting at 0.