I am learning file comparison using awk.
I found syntax like below:
awk 'NR==FNR{a[$1];next}$1 in a{print $1}' file1 file2
I couldn't understand the significance of `NR==FNR` in this.
If I try `FNR==NR` instead, I get the same output.
What exactly does it do?
Look for keys (the first word of each line) in file2 that are also in file1.
Step 1: Fill array `a` with the first words of file1.
Step 2: Fill array `a` and ignore file2 in the same command. To do this, compare the total number of records read so far (`NR`) with the record number within the current input file (`FNR`).
Step 3: Ignore actions that might come after the closing `}` when parsing file1 (this is what `next` does).
Step 4: Print the key from file2 when it is found in array `a`.
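The steps above can be sketched roughly as follows. The contents of `file1` and `file2` here are made-up sample data (the original files are not shown), and the `END` blocks are only added to make the intermediate steps visible.

```shell
# Hypothetical sample data, not taken from the original post.
tmp=$(mktemp -d)
printf 'apple 1\nbanana 2\ncherry 3\n' > "$tmp/file1"
printf 'banana x\ndate y\napple z\n'   > "$tmp/file2"

# Step 1: store the first word of every line of file1 as a key of array a.
awk '{a[$1]} END{for (k in a) n++; print n " keys stored"}' "$tmp/file1"

# Step 2: read both files, but fill a only while NR==FNR (that is, in file1).
awk 'NR==FNR{a[$1]} END{for (k in a) n++; print n " keys stored"}' \
    "$tmp/file1" "$tmp/file2"

# Steps 3 and 4: next skips the second rule for file1 lines;
# the second rule prints file2 keys that were seen in file1.
result=$(awk 'NR==FNR{a[$1];next} $1 in a{print $1}' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"
```

With this sample data, the first two commands each report 3 keys, and the last one prints `banana` and `apple`, in the order they appear in `file2`.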
Assume you have two files, a.txt and b.txt.
Keep in mind that `NR` and `FNR` are awk built-in variables. `NR` gives the total number of records processed so far (in this case across both a.txt and b.txt). `FNR` gives the record number within the current input file (counted separately for a.txt and for b.txt).
Let's add `next` so that the remaining rules are skipped for the lines matched by `NR==FNR`.
With that in place you can print the lines that are in b.txt and in a.txt, or the lines that are in b.txt but not in a.txt.
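A minimal sketch of both cases, and of what happens without `next`. The file contents below are invented for illustration, since the original sample data is not shown.

```shell
# Hypothetical contents for a.txt and b.txt.
tmp=$(mktemp -d)
printf 'red\ngreen\nblue\n'   > "$tmp/a.txt"
printf 'green\nyellow\nred\n' > "$tmp/b.txt"

# Without next, the second rule also runs on a.txt lines. Each of those
# lines has just been stored in a, so it is printed as well.
no_next=$(awk 'NR==FNR{a[$0]} $0 in a' "$tmp/a.txt" "$tmp/b.txt")

# With next: lines in b.txt and in a.txt.
both=$(awk 'NR==FNR{a[$0];next} $0 in a' "$tmp/a.txt" "$tmp/b.txt")

# With next: lines in b.txt but not in a.txt.
only_b=$(awk 'NR==FNR{a[$0];next} !($0 in a)' "$tmp/a.txt" "$tmp/b.txt")

printf 'without next:\n%s\n' "$no_next"
printf 'in both:\n%s\n' "$both"
printf 'only in b.txt:\n%s\n' "$only_b"
```

Here `$0` (the whole line) is used as the key. Using `$1` instead, as in the question, compares only the first field.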
In awk, `FNR` refers to the record number (typically the line number) in the current file and `NR` refers to the total record number. The operator `==` is a comparison operator, which returns true when the two surrounding operands are equal.

This means that the condition `NR==FNR` is only true for the first file, as `FNR` resets back to 1 for the first line of each file but `NR` keeps on increasing.

This pattern is typically used to perform actions on only the first file. The `next` inside the block means any further commands are skipped, so they are only run on files other than the first.

The condition `FNR==NR` compares the same two operands as `NR==FNR`, so it behaves in the same way.

Look up `NR` and `FNR` in the awk manual and then ask yourself what is the condition under which `NR==FNR` in the following example:

These are awk built-in variables.
`NR` - It gives the total number of records processed.
`FNR` - It gives the number of records processed so far in the current input file.
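To see the two counters side by side, a small trace like the one below may help. It is only a sketch: the two input files and their contents are hypothetical.

```shell
# Hypothetical two-line and three-line input files.
tmp=$(mktemp -d)
printf 'a1\na2\n'     > "$tmp/a.txt"
printf 'b1\nb2\nb3\n' > "$tmp/b.txt"

# Print each line with its NR and FNR and whether NR==FNR holds.
trace=$(awk '{ print $0, "NR=" NR, "FNR=" FNR, (NR==FNR ? "first file" : "later file") }' \
    "$tmp/a.txt" "$tmp/b.txt")
printf '%s\n' "$trace"
# a1 NR=1 FNR=1 first file
# a2 NR=2 FNR=2 first file
# b1 NR=3 FNR=1 later file
# b2 NR=4 FNR=2 later file
# b3 NR=5 FNR=3 later file

# The operand order does not matter: both conditions select the same lines.
first=$(awk 'NR==FNR' "$tmp/a.txt" "$tmp/b.txt")
second=$(awk 'FNR==NR' "$tmp/a.txt" "$tmp/b.txt")
```

`FNR` restarts at 1 when awk moves on to `b.txt`, while `NR` keeps counting, which is why the condition stops being true after the first file.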