Count the number of files in a directory containin

Posted 2019-08-24 10:44

Question:

I have a few files in a directory containing the pattern below:

Simulator tool completed simulation at 20:07:18 on 09/28/18.
The situation of the simulation: STATUS PASSED

Now I want to count the number of files that contain both strings, completed simulation and STATUS PASSED, anywhere in the file.

This command works when searching for the single string STATUS PASSED and counting the matching files:

find /directory_path/*.txt -type f -exec grep -l "STATUS PASSED" {} + | wc -l

My attempt to match both strings with sed gives 0 as the result:

find /directory_path/*.txt -type f -exec sed -e '/STATUS PASSED/!d' -e '/completed simulation/!d' {} + | wc -l

Any help or suggestion will be much appreciated!

Answer 1:

find . -type f -exec \
awk '/completed simulation/{x=1} /STATUS PASSED/{y=1} END{if (x&&y) print FILENAME}' {} \; |
wc -l

I'm printing the matching file names in case that's useful in some other context, but piping them to wc -l gives a wrong count if any file name contains a newline. If that's a concern, just print 1 (or anything else) from awk instead.

Since find /directory_path/*.txt -type f is the same as just ls /directory_path/*.txt when all of the *.txt entries are regular files, though, it sounds like all you actually need is this (using GNU awk for nextfile):
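As a rough sketch of the "print 1" variant mentioned above: the function name count_matching_files and its directory argument are made up for illustration. Each matching file produces exactly one output line, whatever its name looks like, so wc -l counts files rather than name fragments.

```shell
# Hypothetical helper: count files under "$1" that contain both strings.
count_matching_files() {
    find "$1" -type f -exec \
        awk '/completed simulation/{x=1} /STATUS PASSED/{y=1} END{if (x&&y) print 1}' {} \; |
    wc -l
}
```

It would be called as count_matching_files /directory_path.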

awk '
    FNR==1 { x=y=0 }
    /completed simulation/ { x=1 }
    /STATUS PASSED/        { y=1 }
    x && y { cnt++; nextfile }
    END { print cnt+0 }
' /directory_path/*.txt

or with any awk:

awk '
    FNR==1 { x=y=f=0 }
    /completed simulation/ { x=1 }
    /STATUS PASSED/        { y=1 }
    x && y && !f { cnt++; f=1 }
    END { print cnt+0 }
' /directory_path/*.txt

Those will work no matter what characters are in your file names.
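To sanity-check the portable version, a throwaway setup along these lines may help. It is only a sketch: the count_passed wrapper, the demo_dir variable, and the sample file contents (including the STATUS FAILED line) are invented for illustration, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Hypothetical wrapper around the "any awk" script; takes the files as arguments.
count_passed() {
    awk '
        FNR==1 { x=y=f=0 }
        /completed simulation/ { x=1 }
        /STATUS PASSED/        { y=1 }
        x && y && !f { cnt++; f=1 }
        END { print cnt+0 }
    ' "$@"
}

# Build three sample files: only both.txt contains both strings.
demo_dir=$(mktemp -d)
printf '%s\n' 'Simulator tool completed simulation at 20:07:18 on 09/28/18.' \
              'The situation of the simulation: STATUS PASSED' > "$demo_dir/both.txt"
printf '%s\n' 'Simulator tool completed simulation at 20:07:18 on 09/28/18.' \
              'The situation of the simulation: STATUS FAILED' > "$demo_dir/failed.txt"
printf '%s\n' 'The situation of the simulation: STATUS PASSED' > "$demo_dir/status only.txt"

result=$(count_passed "$demo_dir"/*.txt)
echo "$result"    # number of sample files containing both strings
rm -rf "$demo_dir"
```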



Answer 2:

Using grep and standard utils:

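{ grep -l 'completed simulation' /directory_path/*.txt;
  grep -l 'STATUS PASSED'        /directory_path/*.txt ; } |
sort | uniq -d | wc -l

grep -l prints only the name of each matching file and stops reading a file at its first match, which saves time if it's a big file. A file that contains both strings is listed once by each grep, so its name appears twice and uniq -d keeps exactly those duplicated names. A plain sort is enough here, since every line is just a file name.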
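A different grep-only arrangement, sketched here as an alternative rather than as part of the answer above, chains two grep -l calls so that the second one only opens files that already matched the first. The function name count_both_grep is made up, and the --null option of grep and the -0 option of xargs are assumed to be available (GNU and BSD tools have them; other systems may not). Counting NUL terminators instead of lines keeps the result correct even for file names containing newlines.

```shell
# Hypothetical helper: count the given files that contain both strings.
count_both_grep() {
    grep -l --null -F 'completed simulation' "$@" |
        xargs -0 grep -l --null -F 'STATUS PASSED' |
        tr -cd '\000' |    # keep only the NUL that terminates each file name
        wc -c              # one NUL per matching file
}
```

It would be called as count_both_grep /directory_path/*.txt.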



Answer 3:

The command find /directory_path/*.txt just lists all the .txt files in /directory_path/, not including those in subdirectories of /directory_path.

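find . -name \*.txt -print0 |
while IFS= read -r -d '' file; do    # read -d needs bash; IFS= and -r keep the name intact
  grep -Fq 'completed simulation' "$file" &&
  grep -Fq 'STATUS PASSED' "$file" &&
  echo x    # fixed marker instead of the name, so odd file names cannot skew wc -l
done |
wc -l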

If you can be sure the file names contain no special characters (newlines, backslashes, leading or trailing whitespace), a simpler loop is enough:

find . -name \*.txt |
while read file; do
  grep -Fq 'completed simulation' "$file" &&
  grep -Fq 'STATUS PASSED' "$file" &&
  echo "$file"
done |
wc -l

I don't have an AIX system to test this on, but the second loop uses only POSIX features, so it should work there.
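For a recursive search that copes with unusual file names without relying on -print0, one common pattern is to let find hand the names to a small sh script through -exec ... {} +. The following is a sketch under that assumption, not a tested AIX recipe; the function name count_with_sh and its directory argument are made up for illustration.

```shell
# Hypothetical helper: recursively count *.txt files under "$1" containing both strings.
count_with_sh() {
    find "$1" -name '*.txt' -type f -exec sh -c '
        # "$@" holds the file names passed by find; "for f do" loops over them.
        for f do
            grep -Fq "completed simulation" "$f" &&
            grep -Fq "STATUS PASSED" "$f" &&
            echo x    # one fixed marker line per matching file
        done
    ' sh {} + | wc -l
}
```

It would be called as count_with_sh /directory_path. Because the names are passed as arguments and never parsed from a text stream, spaces or newlines in them do not affect the count.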