How to exclude a directory in find . command

Posted 2018-12-31 09:21

I'm trying to run a find command for all JavaScript files, but how do I exclude a specific directory?

Here is the find code we're using.

for file in $(find . -name '*.js')
do
  java -jar config/yuicompressor-2.4.2.jar --type js "$file" -o "$file"
done

Tags: linux shell find
30 answers
心情的温度
Reply #2 · 2018-12-31 10:12

This works because find TESTS the files for the pattern "*foo*":

find ! -path "dir1" ! -path "dir2" -name "*foo*"

but in my experience it does NOT work when there is no pattern to test against. Note that ! -path only rejects the entries whose whole path matches the pattern; find still descends into those directories and lists what is under them. Example of the failing case, using the notation above:

find ! -path "dir1" ! -path "dir2" -type f

So if you need to find files without any pattern matching, use -prune. With -prune, find is also faster, because it really skips those directories instead of descending into them and testing every entry. Here dir1 and dir2 stand for the paths as find sees them, including the starting point (for example dir/dir1). In that case use something like:

find dir -not \( -path "dir1" -prune \) -not \( -path "dir2" -prune \) -type f

or:

find dir -not \( \( -path "dir1" -o -path "dir2" \) -prune \) -type f
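A minimal sketch of the difference, run in a throwaway directory. The names demo, dir1, dir2 and keep are made up for illustration, and the -path patterns include the starting point because -path is matched against the whole path. The inner parentheses matter: -o binds more loosely than the implied "and", so without them -prune would apply only to the second -path test.

```shell
#!/bin/sh
# Hypothetical sandbox: demo/dir1, demo/dir2 and demo/keep are illustration names only.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
mkdir -p demo/dir1 demo/dir2 demo/keep
touch demo/dir1/a demo/dir2/b demo/keep/c

# ! -path rejects only the entry whose whole path matches; find still descends,
# so the files under dir1 and dir2 are listed.
no_prune=$(find demo ! -path "demo/dir1" ! -path "demo/dir2" -type f | LC_ALL=C sort)

# -prune stops the descent. The inner \( \) groups both -path tests
# so that -prune applies to either match.
pruned=$(find demo -not \( \( -path "demo/dir1" -o -path "demo/dir2" \) -prune \) -type f | LC_ALL=C sort)

printf 'without -prune:\n%s\n\nwith -prune:\n%s\n' "$no_prune" "$pruned"

# Remove the sandbox.
cd / && rm -rf "$tmp"
```

Only the second command leaves the files under dir1 and dir2 out of the result.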

Regards

梦寄多情
Reply #3 · 2018-12-31 10:12

For FreeBSD users:

 find . -name '*.js' -not -path '*exclude/this/dir*'
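As a rough illustration of what that pattern does, here is a sketch in a scratch directory (src and the file names are hypothetical). In -path patterns the * also matches slashes, so the pattern rejects anything with exclude/this/dir anywhere in its path. This filters the results only; find still walks the excluded tree.

```shell
#!/bin/sh
# Hypothetical sandbox; src/ and the .js file names are illustration only.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
mkdir -p src exclude/this/dir/sub
touch src/app.js exclude/this/dir/lib.js exclude/this/dir/sub/deep.js

# Every path containing "exclude/this/dir" is rejected, at any depth.
kept=$(find . -name '*.js' -not -path '*exclude/this/dir*' | LC_ALL=C sort)
printf '%s\n' "$kept"

# Remove the sandbox.
cd / && rm -rf "$tmp"
```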
何处买醉
Reply #4 · 2018-12-31 10:13

There are lots of answers here already; I'm reluctant to add another, but I think that this information is useful.

TLDR: understand your root directories and tailor your search from there, using the "-prune" option.

Background: I have an rsnapshot (rsync) backup location, /mnt/Backups/, that causes headaches when searching for system (/) files, as those backups comprise ~4.5 TB (tera) of files!

I also have /mnt/Vancouver, my main working folder with TB of files, that is backed up [/mnt/Backups/ and /mnt/Vancouver/ are physically (redundantly) mounted on separate drives].


Of the two top answers here (How to exclude a directory in find . command), I find that searching system files using the accepted answer is much faster, with caveats.

THIS one

find / -path /mnt -prune -o -name "*libname-server-2.a*" -print

finds that file in ~3-4 seconds; this one

find / -name "*libname-server-2.a*" -not -path "/mnt/*"

appears (?) to recurse through all of the excluded directories (deeply nested rsync snapshots of all mounted volumes), so it takes forever. That is consistent with how -not -path works: it only filters what gets printed and does not stop find from descending. I'm presuming that it is searching multi-TB of files, so it's bogged down interminably. For example, if I attempt to "time" that search (time find ...), I see copious output, suggesting that find is deeply traversing the "excluded" directory:

...
find: ‘/mnt/Backups/rsnapshot_backups/monthly.0/snapshot_root/var/lib/udisks2’: Permission denied
...

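Appending a forward slash to the excluded directory (/mnt/) or excluding only a nested path (/mnt/Backups, /mnt/Vancouver) results in that search again taking forever. A -path pattern ending in a slash never matches, because find does not produce paths with a trailing slash, and pruning only one nested path leaves the rest of /mnt to be traversed: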

Slow:

find / -path /mnt/ -prune -o -name "*libname-server-2.a*" -print
find / -path /mnt/Vancouver -prune -o -name "*libname-server-2.a*" -print
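The trailing-slash case can be reproduced on a small scale. This is only a sketch: the root/ tree below is a made-up stand-in for the real filesystem, and the .a file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sandbox: root/ stands in for /, with a small mnt/ and usr/ tree.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
mkdir -p root/mnt/Backups root/usr/lib
touch root/mnt/Backups/old.a root/usr/lib/new.a

# Pattern without a trailing slash: root/mnt matches and is pruned.
ok=$(find root -path root/mnt -prune -o -name '*.a' -print | LC_ALL=C sort)

# Pattern with a trailing slash: it matches nothing, so nothing is pruned.
# (GNU find may print a warning about this on stderr; it is hidden here.)
slash=$(find root -path root/mnt/ -prune -o -name '*.a' -print 2>/dev/null | LC_ALL=C sort)

printf 'pruned:\n%s\n\ntrailing slash:\n%s\n' "$ok" "$slash"

# Remove the sandbox.
cd / && rm -rf "$tmp"
```

With the trailing slash, the file under root/mnt/Backups shows up again, which means find walked the whole subtree.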

"SOLUTION"

Here are the best solutions (all of these execute in seconds). Again, my directory structure is

  • / : root
  • /mnt/Backups/ : multi-TB backups
  • /mnt/Vancouver/ : multi-TB working directory (backed up to /mnt/Backups on separate drive), which I often want to search
  • /home/* : other mountpoints/working "drives" (e.g. /home/victoria = ~)

System files (/):

To quickly find a system file, exclude /mnt (not /mnt/ or /mnt/Backups, ...):

$ find / -path /mnt -prune -o -name "*libname-server-2.a*" -print
/usr/lib/libname-server-2.a

which finds that file in ~3-4 seconds.

Non-system files:

E.g. to quickly locate a file in one of my two working "drives" (/mnt/Vancouver/ and/or /home/victoria/):

$ find /mnt/Vancouver/ -name "*04t8ugijrlkj.jpg"
/mnt/Vancouver/temp/04t8ugijrlkj.jpg

$ find /home/victoria -iname "*Untitled Document 1"
/home/victoria/backups/shortcuts.bak.2016.11.02/Untitled Document 1
/home/victoria/Untitled Document 1

Backups:

E.g. to find a deleted file in one of my hourly/daily/weekly/monthly backups:

$ find /mnt/Backups/rsnapshot_backups/daily.0 -name "*04t8ugijrlkj.jpg"
/mnt/Backups/rsnapshot_backups/daily.0/snapshot_root/mnt/Vancouver/temp/04t8ugijrlkj.jpg 

Aside: Adding -print at the end of the command suppresses the printout of the excluded directory:

$ find / -path /mnt -prune -o -name "*libname-server-2.a*"
/mnt
/usr/lib/libname-server-2.a

$ find / -path /mnt -prune -o -name "*libname-server-2.a*" -print
/usr/lib/libname-server-2.a
$ 
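The reason, as far as I understand it: when the expression contains no action, find behaves as if the whole expression were wrapped in parentheses and followed by -print. The pruned directory makes the left-hand side of -o true, so it is printed as well. With an explicit -print on the right-hand side, only that branch prints. A small sketch with a made-up root/ tree:

```shell
#!/bin/sh
# Hypothetical sandbox: root/mnt is the directory to prune.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
mkdir -p root/mnt root/usr
touch root/mnt/x.a root/usr/y.a

# No action given: find adds an implicit -print for the whole expression,
# so the pruned directory itself appears in the output.
implicit=$(find root -path root/mnt -prune -o -name '*.a' | LC_ALL=C sort)

# Explicit -print belongs to the -name branch only.
explicit=$(find root -path root/mnt -prune -o -name '*.a' -print | LC_ALL=C sort)

printf 'implicit -print:\n%s\n\nexplicit -print:\n%s\n' "$implicit" "$explicit"

# Remove the sandbox.
cd / && rm -rf "$tmp"
```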
不流泪的眼
Reply #5 · 2018-12-31 10:14

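This works for me on a Mac:

find . -name '*.php' -print -or -path "./vendor" -prune -or -path "./app/cache" -prune

It excludes the vendor and app/cache directories from a search for names ending in .php. The pattern is quoted so that the shell does not expand it, and the explicit -print keeps the pruned directories themselves out of the output.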
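Why quoting the name pattern matters can be seen in a scratch directory. This is a sketch with hypothetical file names: when a .php file sits in the current directory, the shell expands an unquoted *.php before find ever sees it.

```shell
#!/bin/sh
# Hypothetical sandbox: index.php in the current directory triggers the glob expansion.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
mkdir -p lib vendor
touch index.php lib/util.php vendor/dep.php

# Unquoted: the shell rewrites *.php to index.php, so find looks for that one name.
unquoted=$(find . -path ./vendor -prune -o -name *.php -print | LC_ALL=C sort)

# Quoted: find receives the pattern itself and matches at every depth.
quoted=$(find . -path ./vendor -prune -o -name '*.php' -print | LC_ALL=C sort)

printf 'unquoted:\n%s\n\nquoted:\n%s\n' "$unquoted" "$quoted"

# Remove the sandbox.
cd / && rm -rf "$tmp"
```

With two or more .php files in the current directory, the unquoted form would not even parse as a valid find expression.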

旧人旧事旧时光
Reply #6 · 2018-12-31 10:14

If the directories you want to search share a name pattern (which is usually the case for me), you can simply do this:

find ./n* -name "*.tcl"

In the example above, the shell expands ./n* before find runs, so find searches only the entries whose names start with "n" and everything below them.
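A quick sketch of that behavior, with made-up directory names n1, n2 and other:

```shell
#!/bin/sh
# Hypothetical sandbox: only n1 and n2 match the ./n* glob.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
mkdir -p n1 n2 other
touch n1/a.tcl n2/b.tcl other/c.tcl

# The shell turns ./n* into the starting points ./n1 ./n2; "other" is never visited.
found=$(find ./n* -name '*.tcl' | LC_ALL=C sort)
printf '%s\n' "$found"

# Remove the sandbox.
cd / && rm -rf "$tmp"
```

This includes directories by name rather than excluding one, so it only helps when the wanted directories follow a common naming scheme.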

梦醉为红颜
Reply #7 · 2018-12-31 10:15

There is clearly some confusion here as to what the preferred syntax for skipping a directory should be.

GNU Opinion

To ignore a directory and the files under it, use -prune

From the GNU find man page

Reasoning

-prune stops find from descending into a directory. Just specifying -not -path will still descend into the skipped directory, but -not -path will be false whenever find tests each file.

Issues with -prune

-prune does what it's intended to, but there are still some things you have to take care of when using it.

  1. find prints the pruned directory.

    • TRUE. That's the intended behavior; find just doesn't descend into it. To avoid printing the directory altogether, use a syntax that logically omits it.
  2. -prune only works with -print and no other actions.

    • NOT TRUE. -prune works with any action except -delete. Why doesn't it work with -delete? For -delete to work, find has to process a directory's contents before the directory itself, because it must delete the leaves first, then their parents, and so on; this is why -delete implies -depth. But for -prune to make sense, find has to reach a directory first and then decide not to descend into it, which is impossible when -depth or -delete is in effect.
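The interaction with -depth can be checked without deleting anything. The sketch below uses a made-up top/ tree and -print in place of -delete: once -depth is on, the files under the "pruned" directory are visited before the directory itself, so -prune comes too late.

```shell
#!/bin/sh
# Hypothetical sandbox: top/skip is the directory we try to prune.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
mkdir -p top/skip top/keep
touch top/skip/a top/keep/b

# Default (pre-order) traversal: top/skip is reached first and pruned.
normal=$(find top -path top/skip -prune -o -type f -print | LC_ALL=C sort)

# With -depth (which -delete implies) the contents are processed first,
# so top/skip/a is printed before find ever tests top/skip.
depth=$(find top -depth -path top/skip -prune -o -type f -print | LC_ALL=C sort)

printf 'normal:\n%s\n\nwith -depth:\n%s\n' "$normal" "$depth"

# Remove the sandbox.
cd / && rm -rf "$tmp"
```

Replacing -print with -delete in the second command would therefore remove the files under top/skip too, which is a good reason to test with -print first.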

Performance

I set up a simple test of the three top upvoted answers on this question (replaced -print with -exec bash -c 'echo $0' {} \; to show another action example). Results are below

----------------------------------------------
# of files/dirs in level one directories
.performance_test/prune_me     702702    
.performance_test/other        2         
----------------------------------------------

> find ".performance_test" -path ".performance_test/prune_me" -prune -o -exec bash -c 'echo "$0"' {} \;
.performance_test
.performance_test/other
.performance_test/other/foo
  [# of files] 3 [Runtime(ns)] 23513814

> find ".performance_test" -not \( -path ".performance_test/prune_me" -prune \) -exec bash -c 'echo "$0"' {} \;
.performance_test
.performance_test/other
.performance_test/other/foo
  [# of files] 3 [Runtime(ns)] 10670141

> find ".performance_test" -not -path ".performance_test/prune_me*" -exec bash -c 'echo "$0"' {} \;
.performance_test
.performance_test/other
.performance_test/other/foo
  [# of files] 3 [Runtime(ns)] 864843145

Conclusion

Both f10bit's syntax and Daniel C. Sobral's syntax took 10-25ms to run on average. GetFree's syntax, which doesn't use -prune, took 865ms. So, yes this is a rather extreme example, but if you care about run time and are doing anything remotely intensive you should use -prune.

Note: Daniel C. Sobral's syntax performed better of the two -prune syntaxes, but I strongly suspect this is the result of caching: switching the order in which the two ran gave the opposite result, while the non-prune version was always slowest.

Test Script

#!/bin/bash

dir='.performance_test'

setup() {
  mkdir "$dir" || exit 1
  mkdir -p "$dir/prune_me/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/w/x/y/z" \
    "$dir/other"

  find "$dir/prune_me" -depth -type d -exec mkdir '{}'/{A..Z} \;
  find "$dir/prune_me" -type d -exec touch '{}'/{1..1000} \;
  touch "$dir/other/foo"
}

cleanup() {
  rm -rf "$dir"
}

stats() {
  for file in "$dir"/*; do
    if [[ -d "$file" ]]; then
      count=$(find "$file" | wc -l)
      printf "%-30s %-10s\n" "$file" "$count"
    fi
  done
}

name1() {
  find "$dir" -path "$dir/prune_me" -prune -o -exec bash -c 'echo "$0"'  {} \;
}

name2() {
  find "$dir" -not \( -path "$dir/prune_me" -prune \) -exec bash -c 'echo "$0"' {} \;
}

name3() {
  find "$dir" -not -path "$dir/prune_me*" -exec bash -c 'echo "$0"' {} \;
}

printf "Setting up test files...\n\n"
setup
echo "----------------------------------------------"
echo "# of files/dirs in level one directories"
stats | sort -k 2 -n -r
echo "----------------------------------------------"

printf "\nRunning performance test...\n\n"

echo \> find \""$dir"\" -path \""$dir/prune_me"\" -prune -o -exec bash -c \'echo \"\$0\"\'  {} \\\;
name1
s=$(date +%s%N)
name1_num=$(name1 | wc -l)
e=$(date +%s%N)
name1_perf=$((e-s))
printf '  [# of files] %s [Runtime(ns)] %s\n\n' "$name1_num" "$name1_perf"

echo \> find \""$dir"\" -not \\\( -path \""$dir/prune_me"\" -prune \\\) -exec bash -c \'echo \"\$0\"\' {} \\\;
name2
s=$(date +%s%N)
name2_num=$(name2 | wc -l)
e=$(date +%s%N)
name2_perf=$((e-s))
printf '  [# of files] %s [Runtime(ns)] %s\n\n' "$name2_num" "$name2_perf"

echo \> find \""$dir"\" -not -path \""$dir/prune_me*"\" -exec bash -c \'echo \"\$0\"\' {} \\\;
name3
s=$(date +%s%N)
name3_num=$(name3 | wc -l)
e=$(date +%s%N)
name3_perf=$((e-s))
printf '  [# of files] %s [Runtime(ns)] %s\n\n' "$name3_num" "$name3_perf"

echo "Cleaning up test files..."
cleanup