Using awk/find to output result and file name

Posted 2019-08-08 08:54

Question:

Example data in three files.

fileOne.txt

YYY >>
 yyy one
 yyy two
 yyy three
<<

ZZZ >>
 zzz one
 zzz two
 zzz three
<<

fileTwo.txt

XXX >>
 xxx one
 xxx two
 xxx three
<<

fileThree.txt

XXX >>
 xxx one
 xxx two
 xxx three
<<

ZZZ >>
 zzz one
 zzz two
 zzz three
<<

I am using awk to output the portion of each file between a start delimiter (XXX) and an end delimiter (<<). This works:

awk '/XXX/,/<</' /d/Temp/temp/*.txt

Results

XXX >>
 xxx one
 xxx two
 xxx three
<<
XXX >>
 xxx one
 xxx two
 xxx three
<<

But I want to output the file names too. find sort of works, but it ends up printing every file name, including files that contain no match.

find /d/Temp/temp/ -type f -name "*.txt" -print -exec awk '/XXX/,/<</' {} \;

Results

/d/Temp/temp/fileOne.txt
/d/Temp/temp/fileThree.txt
XXX >>
 xxx one
 xxx two
 xxx three
<<
/d/Temp/temp/fileTwo.txt
XXX >>
 xxx one
 xxx two
 xxx three
<<

How can I modify this command to only output the matched file names?

Answer 1:

Using awk

awk '/XXX/,/<</{print a[FILENAME]?$0:FILENAME RS $0;a[FILENAME]++}' *.txt

Explanation:

/XXX/,/<</                      # range pattern: select every line from one matching XXX through the next one matching <<
a[FILENAME]?                    # test the counter stored in array `a` under the current file name: non-zero is true, 0 or unset is false
a[FILENAME]?$0:FILENAME RS $0   # if true, print only the line; if false (first selected line of this file), print the file name, a newline (RS), then the line
a[FILENAME]++                   # increment the counter so the file name is printed only once per file
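
The same logic can be written out more verbosely, which may be easier to follow than the ternary. This is only a sketch of the idea explained above, not the answer author's code; the function name `print_blocks_with_names` and the array name `seen` are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (name is illustrative): print every XXX ... << block,
# preceded once per file by the name of the file it came from.
print_blocks_with_names() {
  awk '
    /XXX/,/<</ {
      # seen[FILENAME] is unset (false) the first time a line of this file
      # is selected; the post-increment makes it true from then on.
      if (!seen[FILENAME]++) print FILENAME
      print
    }
  ' "$@"
}
```

It would be called as `print_blocks_with_names /d/Temp/temp/*.txt`. Files that never enter the range produce no output at all, so their names are not printed.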


Answer 2:

I'm sure someone will come up with a clever solution with find and exec or xargs, but this can be pretty simply done using just bash and awk.

> for file in /d/Temp/temp/*.txt; do res=$(awk '/XXX/,/<</' "$file"); [[ $res != "" ]] && echo "$file" && echo "$res"; done
/d/Temp/temp/fileThree.txt
XXX >>
 xxx one
 xxx two
 xxx three
<<
/d/Temp/temp/fileTwo.txt
XXX >>
 xxx one
 xxx two
 xxx three
<<

Or, split into a more readable shell script:

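#!/bin/bash
for file in "/d/Temp/temp/"*.txt; do
  # Capture the XXX ... << block(s); empty if the file has none
  res=$(awk '/XXX/,/<</' "$file")
  # Print the file name and the block only when something matched
  [[ -n $res ]] && echo "$file" && echo "$res"
done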

If you want it to be recursive and are using bash 4+, you can replace the starting for loop with

> shopt -s globstar; for file in /d/Temp/temp/**/*.txt; do

If you are using an older version of bash, you can replace it with a find loop

> find /d/Temp/temp/ -type f -name "*.txt" -print0 | while IFS= read -r -d '' file; do
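
Put together with the loop body from the script, the find-based variant might look like the sketch below. The function name `search_tree` and its directory argument are assumptions added for illustration. `IFS=` is set for `read` so that leading or trailing whitespace in a file name is preserved, and an `if` is used instead of `&&` so the function does not return a failure status when the last file has no match.

```shell
#!/bin/bash
# Hypothetical wrapper (name is illustrative): recursively search a directory
# for *.txt files and print the name and XXX ... << block(s) of each match.
search_tree() {
  local dir=$1 file res
  # -print0 and read -d '' pass file names NUL-delimited, so names
  # containing spaces or newlines are handled safely.
  find "$dir" -type f -name "*.txt" -print0 |
  while IFS= read -r -d '' file; do
    res=$(awk '/XXX/,/<</' "$file")
    if [[ -n $res ]]; then
      printf '%s\n%s\n' "$file" "$res"
    fi
  done
}
```

For the directory in the question this would be run as `search_tree /d/Temp/temp`. The order of the files in the output follows whatever order find returns them in.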


Tags: bash awk find