How to detect EOF in awk?

Posted 2020-08-13 04:52

Question:

Is there a way to determine whether the current line is the last line of the input stream?

Answer 1:

You've got two options, both kind of messy.

  1. Store a copy of each line in a temp variable as you read it, and then use the END block to process the last one.
  2. Use a command pipe such as "wc -l < file" | getline n in the BEGIN block to get the number of lines in the file, and then count up to that value.

You might have to play with #2 a little to get it to run, but it should work. It's been a while since I've done any awk.
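
A minimal sketch of option 1, assuming a POSIX awk and input on standard input. The function name mark_last and the "line:" / "last:" labels are made up for illustration. Each line is held back by one record, so whatever is still held when END runs must be the last line.

```shell
# Hypothetical helper: label every input line, holding each one back by one
# record so the final line can be handled separately in END.
mark_last() {
    awk '
        NR > 1 { print "line: " prev }   # the held line was not the last one
               { prev = $0 }             # hold the current line
        END    { if (NR > 0) print "last: " prev }
    '
}

printf 'a\nb\nc\n' | mark_last
```

The NR > 0 guard keeps END quiet on empty input.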



Answer 2:

The special END pattern will match only after the end of all input. Note that this pattern can't be combined with any other pattern.

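More useful is probably the getline pseudo-function, which sets $0 to the next line and returns 1, or returns 0 at EOF, which I think is what you want. Note that each call consumes a line of input, so a rule that calls it on every record only sees every other line.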

For example:

awk '{ if(getline == 0) { print "Found EOF"} }'

If you are only processing one file, this would be equivalent:

awk 'END { print "Found EOF" }'
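
A hedged sketch of a read-ahead loop built on the "getline var" form, which keeps the current line available while checking whether another one follows. It assumes a POSIX awk; the function name lookahead_last and the variables cur and nxt are illustrative only.

```shell
# Hypothetical helper: read ahead with "getline nxt" so the current line is
# still available at the moment EOF is discovered.
lookahead_last() {
    awk '
        {
            cur = $0
            # "getline var" returns 1 for each record read, 0 at end of input
            while ((getline nxt) > 0) {
                print "line: " cur
                cur = nxt
            }
            print "last: " cur
        }
    ' "$@"
}

printf 'a\nb\nc\n' | lookahead_last
```

The main rule runs only once, for the first record; the loop then drains the rest of the input. With several files on the command line, this reports the last line of the whole input rather than of each file.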


Answer 3:

These are the only sensible ways to do what you want, in order of best to worst:

awk 'NR==FNR{max++; next} FNR == max { print "Final line:",$0 }' file file

awk -v max="$(wc -l < file)" 'FNR == max { print "Final line:",$0 }' file

awk 'BEGIN{ while ( (getline dummy < ARGV[1]) > 0) max++; close(ARGV[1])} FNR == max { print "Final line:",$0 }' file


Answer 4:

Detecting EOF is not very reliable when multiple files are on the command line. Detecting the start of a file is more reliable.

To do this, treat the first file as special and ignore its FNR==1.

After the first file, FNR==1 marks the end of the previous file. last_filename always holds the name of the file being processed; at the moment FNR==1 fires for a new file, it still names the file that just ended.

Do your file processing after the else block.

Do your EOF processing inside the else block, AND in the END block.

   gawk 'BEGIN{last_filename="";} \
      FNR==1{if (last_filename==""){last_filename=FILENAME;} \
      else {print "EOF: "last_filename;last_filename=FILENAME;}} \
      END{print "END: "last_filename;}' "$@"

With multiple files, the else block runs at EOF for every file but the last; the last file's EOF is handled in the END block.

With a single file, the else block never runs and only the END block does.



Answer 5:

The gawk implementation has a special rule called ENDFILE, which is triggered after processing each file in the argument list. This works:

gawk '{line=$0} ENDFILE {print line}' files...

More details can be found in the gawk manual.



Answer 6:

I'm not even sure how to categorize this "solution"

{
    t = lastline
    lastline = $0
    $0 = t
}

/test/ {
    print "line <" $0 "> had a _test_"
}

END {
    # now you have "lastline", it can't be processed with the above statements
    # ...but you can work with it here
}

The cool thing about this hack is that by assigning to $0, all the remaining declarative patterns and actions work, one line delayed. You can't get them to work for the END, even if you put the END on top, but you do have control over the last line and you haven't done anything else to it.



Answer 7:

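To detect the last line of each file in the argument list, the following works nicely. It remembers the previous line and reports it when the next file starts (FNR == 1) or at the end of all input:

FNR == 1 && NR > 1 {
  print "last line (" prevfile "): " prev
}
{ prev = $0; prevfile = FILENAME }
END { if (NR) print "last line (" prevfile "): " prev }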


Answer 8:

One easy way is to run the file through an intermediate sed script that places a 0 at the start of every line except the last, and a 1 at the start of the last one.

cat input_file | sed 's/^/0/;$s/0/1/' | awk '{LST=/^1/;$0=substr($0,2)}
... your awk script in which you can use LST to check for the
... last line.'
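
A concrete sketch of the same pipeline with the placeholder filled in, assuming a POSIX sed and awk. The function name flag_last and the output labels are invented for the example; LST is the flag from the snippet above.

```shell
# Hypothetical helper: sed prefixes 0 to every line and rewrites that prefix
# to 1 on the last line; awk moves the flag into LST before other rules run.
flag_last() {
    sed 's/^/0/;$s/0/1/' | awk '
        { LST = /^1/; $0 = substr($0, 2) }   # LST is 1 only on the last line
        LST { print "last: " $0; next }
            { print "line: " $0 }
    '
}

printf 'a\nb\nc\n' | flag_last
```

Because the flag is always the first character and is stripped before any other rule runs, lines that themselves begin with 0 or 1 are handled correctly.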


Answer 9:

Hmm, the awk END block runs only after you have already reached EOF. That isn't really much help to you, I guess.



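Answer 10:

You can try this, which prints a line each time FNR resets at the start of a new file, and once more in the END block: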

awk 'BEGIN{PFNR=1} FNR==PFNR{PFNR++;next} {print FILENAME,PFNR=2} END{print FILENAME}' file1 file2


Answer 11:

A portable solution is provided in the gawk user manual, although as mentioned in another answer, gawk itself has BEGINFILE and ENDFILE.
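
A minimal sketch of the portable idea, assuming a POSIX awk and named files on the command line. It watches FILENAME for changes to emulate ENDFILE and is not a copy of the manual's code; the names per_file_last, endfile, oldname, and prev are made up for illustration.

```shell
# Hypothetical helper: emulate gawk's ENDFILE in POSIX awk by watching for a
# change in FILENAME. endfile() is a user-defined function, not a built-in.
per_file_last() {
    awk '
        function endfile(name) { print "last line (" name "): " prev }

        FILENAME != oldname {
            if (oldname != "") endfile(oldname)   # previous file just ended
            oldname = FILENAME
        }
        { prev = $0 }
        END { if (oldname != "") endfile(oldname) }
    ' "$@"
}
```

Call it as per_file_last file1 file2. Two limits of this sketch: empty files are skipped silently, and listing the same file twice in a row goes undetected because FILENAME does not change.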



Tags: awk eof