I have written a script in which I want to count the number of columns in each line of data.txt. My problem is that I am unable to set the shell variable x from inside the awk script.
Any help would be highly appreciated.
while read p; do
x=1;
echo $p | awk -F' ' '{x=NF}'
echo $x;
file="$x"".txt";
echo $file;
done <$1
data.txt file:
4495125 94307025 giovy115p@live.it 94307025.094307025 12443
stazla deva1a23@gmail.com 1992/.:\1
1447585 gioao_87@hotmail.it h1st@1
saknit tomboro@seznam.cz 1233 1990
Expected output:
5.txt
3.txt
3.txt
4.txt
My output:
1.txt
1.txt
1.txt
1.txt
You cannot export a variable set in Awk to the shell context. Awk runs as a separate process, so in your example the x assigned inside Awk (holding NF) is a different variable from the shell's x, and its value is not reflected outside.
Instead, use command substitution ($(..)) syntax to capture the value of NF and use it later:
x=$(echo "$p" | awk '{print NF}')
Now x will contain the column count of the current line on each iteration. Note that you don't need -F' ', because splitting on whitespace is already the default in awk.
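Putting that together, the loop from the question could be rewritten roughly as follows. This is only a sketch: the function name count_columns is made up for illustration, and read -r plus the quoting are additions that were not in the original script.

```shell
# count_columns is a hypothetical wrapper around the corrected loop.
count_columns() {
    while IFS= read -r p; do
        # Command substitution captures awk's output in the shell variable x
        x=$(printf '%s\n' "$p" | awk '{print NF}')
        file="${x}.txt"
        echo "$file"
    done < "$1"
}
```

Calling count_columns data.txt should then print one file name per input line.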
Besides, your requirement can be met entirely in Awk itself.
awk 'NF{print NF".txt"}' file
Here the NF{..} pattern ensures that the actions inside {..} are applied only to non-empty rows. Then, for each row, we print the number of fields and append the extension .txt to it.
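To see the effect of the NF pattern, here is a small sketch that feeds the same command three lines, the second of which is empty. The sample input is invented for the demonstration and is not the data.txt from the question.

```shell
# Made-up input: a three-field line, an empty line, a two-field line
out=$(printf 'a b c\n\nd e\n' | awk 'NF{print NF".txt"}')
# The empty line has NF == 0, so the pattern is false and nothing is printed for it
echo "$out"
```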
Awk processes a line at a time -- processing each line in a separate Awk script inside a shell while read
loop is horrendously inefficient. See also https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice
Maybe something like this:
awk '{ print >(NF ".txt") }' data.txt
to create a file with the five-column rows in 5.txt, the four-column ones in 4.txt, the three-column rows in 3.txt, etc. for each unique column count.
The Awk variable NF
contains the number of fields (by default, Awk splits fields on runs of whitespace -- use -F
to change to some other separator) and the expression (NF ".txt")
simply produces a string catenation of the number of fields with the suffix .txt
which we pass as a file name to the print
redirection.
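A quick way to try the redirection without touching your real files is to run it in a scratch directory. The sketch below assumes mktemp is available and uses invented sample rows; workdir is just an illustrative variable name.

```shell
# Scratch directory so the generated N.txt files don't clutter the current one
workdir=$(mktemp -d)
printf 'a b c\nd e\nf g h\n' > "$workdir/data.txt"
# Run awk inside the scratch directory; each row goes to <field count>.txt
(cd "$workdir" && awk '{ print >(NF ".txt") }' data.txt)
# 3.txt now holds both three-field rows, 2.txt the two-field row
cat "$workdir/3.txt"
```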
With bash:
while read p; do p=($p); echo "${#p[@]}.txt"; done < file
or shorter:
while read -a p; do echo "${#p[@]}.txt"; done < file
Output:
5.txt
3.txt
3.txt
4.txt