Recursively find all files that match a certain pattern

Posted 2019-01-22 01:42

Question:

I need to find (or more specifically, count) all files that match this pattern:

*/foo/*.doc

Where the first wildcard asterisk includes a variable number of subdirectories.

Answer 1:

With GNU find you can use -regex, which (unlike -name) is matched against the entire path:

find . -regex '.*/foo/[^/]*\.doc'

To just count the number of files:

find . -regex '.*/foo/[^/]*\.doc' -printf '%i\n' | wc -l

(The %i format code causes find to print the inode number instead of the filename; unlike the filename, the inode number is guaranteed to not have characters like a newline, so counting is more reliable. Thanks to @tripleee for the suggestion.)

I don't know if that will work on OSX, though.
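
For what it's worth, here is a sketch for trying the pattern on a throwaway tree without -printf. As far as I know, the BSD find shipped with macOS accepts -regex but has no -printf, so this version emits one dot per match via -exec and counts the dots with wc -c. The directory layout and file names are made up for illustration.

```shell
# Build a scratch tree (hypothetical names, for illustration only).
base=$(mktemp -d)
mkdir -p "$base/a/foo" "$base/a/b/foo/sub" "$base/a/food"
touch "$base/a/foo/one.doc" "$base/a/b/foo/two.doc"   # should match
touch "$base/a/b/foo/sub/deep.doc"                    # one level too deep below foo
touch "$base/a/food/other.doc"                        # directory is not exactly "foo"
touch "$base/a/foo/notes.txt"                         # wrong extension

# Print one dot per match and count bytes, so odd file names cannot skew the count.
count=$(find "$base" -type f -regex '.*/foo/[^/]*\.doc' -exec printf '.' \; | wc -c)
count=$((count))   # strip the padding some wc implementations add
echo "$count"

rm -rf "$base"
```

Running one printf per match is slower than -printf on large trees, so this is only worth it where -printf is unavailable.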



Answer 2:

How about:

find BASE_OF_SEARCH/*/foo -name \*.doc -type f | wc -l

What this is doing:

  • start at directory BASE_OF_SEARCH/
  • look in every foo directory exactly one level below BASE_OF_SEARCH (the * matches a single path component)
  • look for files named like *.doc
  • count the lines of the result (one per file)

The benefit of this method:

  • no explicit loops or recursion in the shell (find does the directory walk itself)
  • it's easy to read, and if you include it in a script it's fairly easy to decipher (regex sometimes is not).

UPDATE: You want variable depth? OK:

find BASE_OF_SEARCH -name \*.doc -type f | grep foo | wc -l

  • start at directory BASE_OF_SEARCH
  • look for files named like *.doc
  • only show the lines of this result that include "foo"
  • count the lines of the result (one per file)

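Note that grep foo matches "foo" anywhere in the path, so files with "foo" in their own name, or under a directory such as "food", are counted too. Optionally, you could filter those out with a stricter pattern.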
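
One way the optional filtering could look, as a sketch: anchor the grep pattern so that foo has to be a whole directory name and the immediate parent of the .doc file. BASE_OF_SEARCH is the placeholder from above, pointed here at a made-up scratch tree.

```shell
# Scratch tree with hypothetical names, standing in for BASE_OF_SEARCH.
BASE_OF_SEARCH=$(mktemp -d)
mkdir -p "$BASE_OF_SEARCH/x/y/foo" "$BASE_OF_SEARCH/x/bar"
touch "$BASE_OF_SEARCH/x/y/foo/report.doc"    # wanted
touch "$BASE_OF_SEARCH/x/bar/foo_notes.doc"   # "foo" appears only in the file name

# Plain "grep foo" counts both files.
loose=$(find "$BASE_OF_SEARCH" -name '*.doc' -type f | grep -c foo)

# Anchored pattern: /foo/ followed by a file name without slashes, ending in .doc.
strict=$(find "$BASE_OF_SEARCH" -name '*.doc' -type f | grep -c '/foo/[^/]*\.doc$')

echo "loose=$loose strict=$strict"
rm -rf "$BASE_OF_SEARCH"
```

Like the original pipeline, this counts lines, so a file name containing a newline would still be miscounted.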



Answer 3:

Untested, but try:

find . -type d -name foo -print | while IFS= read -r d; do for f in "$d"/*.doc; do [ -f "$f" ] && echo x; done; done | wc -l

Find all the "foo" directories (at varying depths), use shell globbing to find the ".doc" files in each, then count them. This ignores symlinks; if that is part of the problem, you can add them.
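
A possible variant of the same idea, sketched with find's -exec instead of a read loop, so that each foo directory is handed to a child shell intact even if its name contains unusual characters. The scratch tree and its names are made up.

```shell
# Scratch tree with hypothetical names.
top=$(mktemp -d)
mkdir -p "$top/p/foo" "$top/p/q/foo" "$top/p/q/foo/inner"
touch "$top/p/foo/a.doc" "$top/p/foo/b.doc" "$top/p/q/foo/c.doc"
touch "$top/p/q/foo/inner/d.doc"   # not directly inside a foo directory

# For every directory named foo, expand foo/*.doc in a child shell and
# print one line per regular file. An unmatched glob stays literal and
# fails the -f test, so empty foo directories contribute nothing.
n=$(find "$top" -type d -name foo -exec sh -c '
  for f in "$1"/*.doc; do
    [ -f "$f" ] && echo x
  done
  exit 0
' sh {} \; | wc -l)
n=$((n))   # strip the padding some wc implementations add
echo "$n"

rm -rf "$top"
```

Because the glob is expanded by the shell, hidden files such as .draft.doc are skipped, which differs from what find -name '*.doc' would report.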



Answer 4:

Based on the answers on this page and on other pages, I managed to put together the following. It searches the current folder and everything under it for files with the extension pdf, then keeps only those whose path (file name included) contains test_text.

find . -name "*.pdf" | grep test_text | wc -l
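
Since grep sees the whole path, a directory called test_text would also pull in every pdf beneath it. If only the file name should be tested, one possible sketch is to fold the text into the -name pattern, which find compares against the file name alone. The tree below is made up for illustration.

```shell
# Scratch tree with hypothetical names.
root=$(mktemp -d)
mkdir -p "$root/test_text_dir" "$root/docs"
touch "$root/docs/my_test_text_v1.pdf"      # file name contains test_text
touch "$root/test_text_dir/unrelated.pdf"   # only the directory name matches

# grep on the full path counts both files.
by_path=$(find "$root" -type f -name '*.pdf' | grep -c test_text)

# -name is matched against the file name only, so this counts one.
by_name=$(find "$root" -type f -name '*test_text*.pdf' | wc -l)
by_name=$((by_name))   # strip the padding some wc implementations add

echo "by_path=$by_path by_name=$by_name"
rm -rf "$root"
```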