Set an external variable in awk

Posted 2019-08-23 12:42

Question:

I have written a script in which I want to count the number of columns in each line of data.txt. My problem is that I am unable to set x from inside the awk script. Any help would be highly appreciated.

while read p; do
  x=1;
  echo $p | awk -F' ' '{x=NF}'
  echo $x;
  file="$x"".txt";
  echo $file;
  done <$1

data.txt file:

4495125 94307025    giovy115p@live.it   94307025.094307025  12443
stazla  deva1a23@gmail.com  1992/.:\1
1447585 gioao_87@hotmail.it h1st@1
saknit  tomboro@seznam.cz   1233    1990

Expected output:

5.txt
3.txt
3.txt
4.txt

My output:

1.txt
1.txt
1.txt
1.txt

Answer 1:

You cannot export a variable set in Awk to the shell context. Awk runs as a separate process, so in your example the value of NF assigned to x inside Awk is not reflected outside it.

One option is to use command substitution, the $(..) syntax, to capture the value of NF and use it later:

x=$(echo "$p" | awk '{print NF}')

Now x will contain the column count of each line. Note that you don't need -F' ', because a single space is already the default field separator in awk.
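For reference, the loop from the question with that fix applied might look like the sketch below. The function name name_by_columns is made up for illustration and takes the input file as its first argument. Using IFS= and read -r is an added precaution (not part of the original answer) so that backslashes and leading spaces, as in the 1992/.:\1 field, reach awk unchanged.

```shell
# Hypothetical helper: print "<field count>.txt" for each line of the file in $1.
name_by_columns() {
  while IFS= read -r p; do
    # Command substitution captures awk's standard output in the shell variable x.
    x=$(printf '%s\n' "$p" | awk '{print NF}')
    file="${x}.txt"
    printf '%s\n' "$file"
  done < "$1"
}
```

Called as name_by_columns data.txt, this prints one file name per input line. It still starts one awk process per line, so it is only reasonable for small inputs.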

Alternatively, the whole requirement can be handled in Awk itself:

awk 'NF{print NF".txt"}' file

Here the NF{..} pattern ensures that the action inside {..} is applied only to non-empty rows. Then for each row we print the number of fields and append the extension .txt to it.
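To see the effect of the NF pattern on blank lines, here is a small sketch. The wrapper name column_names is hypothetical; it just runs the one-liner above on the file passed as its first argument.

```shell
# Hypothetical wrapper: print "<field count>.txt" for every non-empty line of $1.
column_names() {
  # A line with no fields has NF == 0, which is false, so the action is skipped.
  awk 'NF { print NF ".txt" }' "$1"
}
```

For an input with a blank line between a two-column and a three-column row, only two names are printed, because the blank line never reaches the print.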



Answer 2:

Awk processes a line at a time -- processing each line in a separate Awk script inside a shell while read loop is horrendously inefficient. See also https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice

Maybe something like this:

awk '{ print >(NF ".txt") }' data.txt

to create a file with the five-column rows in 5.txt, the four-column ones in 4.txt, the three-column rows in 3.txt, etc., for each unique column count.

The Awk variable NF contains the number of fields (by default, Awk splits fields on runs of whitespace -- use -F to change to some other separator) and the expression (NF ".txt") simply produces a string catenation of the number of fields with the suffix .txt which we pass as a file name to the print redirection.
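A variation on the same idea, as a sketch only: the helper name split_by_columns and its second argument (an output directory) are assumptions added here so the generated files do not land in the current directory. It also skips empty lines, which the command above would write to 0.txt, and it assumes the directory name contains no backslashes, since awk interprets escape sequences in -v values.

```shell
# Hypothetical helper: split file $1 into <NF>.txt files inside directory $2.
split_by_columns() {
  awk -v dir="$2" 'NF {
    out = dir "/" NF ".txt"   # build the target file name from the field count
    print > out               # awk keeps each file open and appends further rows
  }' "$1"
}
```

With the sample data.txt, split_by_columns data.txt out (where out is an existing directory) would leave out/5.txt, out/4.txt and out/3.txt, each holding the rows with that many fields.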



Answer 3:

With bash:

while read p; do p=($p); echo "${#p[@]}.txt"; done < file

or shorter:

while read -a p; do echo "${#p[@]}.txt"; done < file

Output:

5.txt
3.txt
3.txt
4.txt


Tags: linux bash awk