Question:
I'm trying to find the files that exist in one directory but not in the other. I tried this command:
diff -q dir1 dir2
The problem with the above command is that it finds both the files in dir1 but not in dir2 and the files in dir2 but not in dir1. I am trying to find only the files in dir1 but not in dir2.
Here's a small sample of what my data looks like:
dir1    dir2    dir3
1.txt   1.txt   1.txt
2.txt   3.txt   3.txt
5.txt   4.txt   5.txt
6.txt   7.txt   8.txt
Another question on my mind is how can I find the files in dir1 but not in dir2 or dir3 in a single command?
Answer 1:
diff -r dir1 dir2 | grep dir1 | awk '{print $4}' > difference1.txt
Explanation:
diff -r dir1 dir2 shows which files are only in dir1, which are only in dir2, and the changes to files present in both directories, if any.
diff -r dir1 dir2 | grep dir1 keeps only the lines that mention dir1, which includes the files found only in dir1.
awk '{print $4}' prints only the filename.
Answer 2:
This should do the job:
diff -rq dir1 dir2
Options explained (via the diff(1) man page):
-r - Recursively compare any subdirectories found.
-q - Output only whether files differ.
Answer 3:
comm -23 <(ls dir1 |sort) <(ls dir2|sort)
This command lists the files that are in dir1 and not in dir2.
The <( ) syntax is called 'process substitution'; you can search for that term to learn more.
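For use inside a script, the same set difference can be sketched in Python. This is only an illustration and not part of the original answer: the function name only_in_first is made up here, and like ls it looks at top-level entry names only. Accepting several directories to exclude also covers the "in dir1 but not in dir2 or dir3" case from the question.

```python
import os


def only_in_first(first, *others):
    """Return sorted names found in `first` but in none of `others`.

    Hypothetical helper: compares top-level entry names only, like `ls`.
    """
    names = set(os.listdir(first))
    for other in others:
        # Drop every name that also appears in this other directory.
        names -= set(os.listdir(other))
    return sorted(names)
```

With the sample data from the question, only_in_first('dir1', 'dir2') would give ['2.txt', '5.txt', '6.txt'], and only_in_first('dir1', 'dir2', 'dir3') would give ['2.txt', '6.txt'].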
Answer 4:
A good way to do this comparison is to use find with md5sum, then diff.
Example:
Use find to list all the files in the directory, calculate the MD5 hash of each file, and write the result to a file. Run find from inside the directory and sort by path, so that both listings use the same relative paths in the same order:
(cd /dir1/ && find . -type f -exec md5sum {} \; | sort -k 2) > dir1.txt
Do the same for the other directory:
(cd /dir2/ && find . -type f -exec md5sum {} \; | sort -k 2) > dir2.txt
Then compare the two result files with diff:
diff dir1.txt dir2.txt
This strategy is very useful when the two directories to be compared are not on the same machine and you need to make sure that the files are equal in both directories.
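The hash comparison described above can also be sketched in Python. This is a minimal illustration, not the answer author's method: hash_tree and compare_trees are names invented for this example, and paths are stored relative to each root so that the two trees line up regardless of where they live.

```python
import hashlib
import os


def hash_tree(root):
    """Map each file's path relative to `root` to the MD5 hex digest of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            md5 = hashlib.md5()
            with open(path, "rb") as fh:
                # Read in chunks so large files are not loaded into memory at once.
                for chunk in iter(lambda: fh.read(65536), b""):
                    md5.update(chunk)
            digests[os.path.relpath(path, root)] = md5.hexdigest()
    return digests


def compare_trees(dir1, dir2):
    """Return (only_in_dir1, only_in_dir2, changed) as sorted lists of relative paths."""
    h1, h2 = hash_tree(dir1), hash_tree(dir2)
    only1 = sorted(h1.keys() - h2.keys())
    only2 = sorted(h2.keys() - h1.keys())
    # A file is "changed" when it exists on both sides with different digests.
    changed = sorted(p for p in h1.keys() & h2.keys() if h1[p] != h2[p])
    return only1, only2, changed
```

When the directories are on different machines, hash_tree could be run on each side separately and the two dictionaries compared afterwards.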
Another good way to do the job is to use git:
git diff --no-index dir1/ dir2/
Best regards!
Answer 5:
Meld (http://meldmerge.org/) does a great job at comparing directories and the files within.
Answer 6:
Vim's DirDiff plugin is another very useful tool for comparing directories.
vim -c "DirDiff dir1 dir2"
It not only lists which files are different between the directories, but also allows you to inspect/modify with vimdiff the files that are different.
Answer 7:
Unsatisfied with all the replies, since most of them work very slowly and produce unnecessarily long output for large directories, I wrote my own Python script to compare two folders.
Unlike many other solutions, it doesn't compare the contents of the files. It also doesn't descend into subdirectories that are missing from the other directory. So the output is quite concise and the script runs fast.
#!/usr/bin/env python3

import os, sys

def compare_dirs(d1: "old directory name", d2: "new directory name"):
    def print_local(a, msg):
        print('DIR ' if a[2] else 'FILE', a[1], msg)
    # ensure validity
    for d in [d1,d2]:
        if not os.path.isdir(d):
            raise ValueError("not a directory: " + d)
    # get relative path
    l1 = [(x,os.path.join(d1,x)) for x in os.listdir(d1)]
    l2 = [(x,os.path.join(d2,x)) for x in os.listdir(d2)]
    # determine type: directory or file?
    l1 = sorted([(x,y,os.path.isdir(y)) for x,y in l1])
    l2 = sorted([(x,y,os.path.isdir(y)) for x,y in l2])
    i1 = i2 = 0
    common_dirs = []
    while i1<len(l1) and i2<len(l2):
        if l1[i1][0] == l2[i2][0]:      # same name
            if l1[i1][2] == l2[i2][2]:  # same type
                if l1[i1][2]:           # remember this folder for recursion
                    common_dirs.append((l1[i1][1], l2[i2][1]))
            else:
                print_local(l1[i1],'type changed')
            i1 += 1
            i2 += 1
        elif l1[i1][0]<l2[i2][0]:
            print_local(l1[i1],'removed')
            i1 += 1
        elif l1[i1][0]>l2[i2][0]:
            print_local(l2[i2],'added')
            i2 += 1
    while i1<len(l1):
        print_local(l1[i1],'removed')
        i1 += 1
    while i2<len(l2):
        print_local(l2[i2],'added')
        i2 += 1
    # compare subfolders recursively
    for sd1,sd2 in common_dirs:
        compare_dirs(sd1, sd2)

if __name__=="__main__":
    compare_dirs(sys.argv[1], sys.argv[2])
Sample usage:
user@laptop:~$ python3 compare_dirs.py dir1/ dir2/
DIR dir1/out/flavor-domino removed
DIR dir2/out/flavor-maxim2 added
DIR dir1/target/vendor/flavor-domino removed
DIR dir2/target/vendor/flavor-maxim2 added
FILE dir1/tmp/.kconfig-flavor_domino removed
FILE dir2/tmp/.kconfig-flavor_maxim2 added
DIR dir2/tools/tools/LiveSuit_For_Linux64 added
Or if you want to see only files from the first directory:
user@laptop:~$ python3 compare_dirs.py dir2/ dir1/ | grep dir1
DIR dir1/out/flavor-domino added
DIR dir1/target/vendor/flavor-domino added
FILE dir1/tmp/.kconfig-flavor_domino added
P.S. If you need to compare file sizes and file hashes for potential changes, I published an updated script here: https://gist.github.com/amakukha/f489cbde2afd32817f8e866cf4abe779
Answer 8:
Another (maybe faster for large directories) approach:
$ find dir1 | sed 's,^[^/]*/,,' | sort > dir1.txt && find dir2 | sed 's,^[^/]*/,,' | sort > dir2.txt
$ diff dir1.txt dir2.txt
(The sed command removes the first directory component; thanks to Erik's post.)
Answer 9:
This is a bit late but may help someone. Not sure if diff or rsync spit out just filenames in a bare format like this. Thanks to plhn for giving that nice solution which I expanded upon below.
If you want just the filenames so it's easy to just copy the files you need in a clean format, you can use the find command.
comm -23 <(find dir1 | sed 's/dir1/\//'| sort) <(find dir2 | sed 's/dir2/\//'| sort) | sed 's/^\//dir1/'
This assumes that both dir1 and dir2 are in the same parent folder. sed just removes the parent folder so you can compare apples with apples. The last sed just puts the dir1 name back.
If you just want files:
comm -23 <(find dir1 -type f | sed 's/dir1/\//'| sort) <(find dir2 -type f | sed 's/dir2/\//'| sort) | sed 's/^\//dir1/'
Similarly for directories:
comm -23 <(find dir1 -type d | sed 's/dir1/\//'| sort) <(find dir2 -type d | sed 's/dir2/\//'| sort) | sed 's/^\//dir1/'
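The find, sed, and comm pipeline above can be approximated in Python without rewriting path prefixes by hand. The sketch below is an illustration under assumptions, not the answer's own code: relative_entries, missing_from_second, and the kind parameter are names invented here, and os.walk lists symbolic links to directories as directories, which find -type d would not.

```python
import os


def relative_entries(root, kind="all"):
    """Collect paths under `root`, relative to `root`.

    `kind` is "files", "dirs", or "all" (a made-up parameter for this sketch).
    """
    entries = set()
    for dirpath, dirnames, filenames in os.walk(root):
        names = []
        if kind in ("dirs", "all"):
            names += dirnames
        if kind in ("files", "all"):
            names += filenames
        for name in names:
            entries.add(os.path.relpath(os.path.join(dirpath, name), root))
    return entries


def missing_from_second(dir1, dir2, kind="all"):
    """Return paths present under dir1 but absent under dir2, prefixed with dir1."""
    only = relative_entries(dir1, kind) - relative_entries(dir2, kind)
    # Put the dir1 prefix back, as the final sed in the pipeline does.
    return sorted(os.path.join(dir1, rel) for rel in only)
```

Calling missing_from_second('dir1', 'dir2', kind='files') corresponds to the -type f variant, and kind='dirs' to the -type d variant.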
Answer 10:
The accepted answer will also list the files that exist in both directories but have different content. To list ONLY the files that exist in dir1, you can use:
diff -r dir1 dir2 | grep 'Only in' | grep dir1 | awk '{print $4}' > difference1.txt
Explanation:
- diff -r dir1 dir2 : compare the two directories recursively
- grep 'Only in' : get lines that contain 'Only in'
- grep dir1 : get lines that contain dir1
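The grep and awk filter can misfire when the string dir1 also appears inside a dir2 path, or when a filename contains spaces (awk then prints only its first word). A stricter parse of the same diff output might look like the Python sketch below. It assumes GNU diff's English "Only in DIR: NAME" message format, and only_in is a hypothetical helper name.

```python
import os


def only_in(diff_output, directory):
    """Extract paths that `diff -r` reports as existing only under `directory`.

    Assumes GNU diff's English "Only in DIR: NAME" message format.
    """
    prefix = "Only in "
    root = os.path.normpath(directory)
    found = []
    for line in diff_output.splitlines():
        if not line.startswith(prefix):
            continue
        # Split at the first ": "; this assumes directory names contain no ": ".
        parent, sep, name = line[len(prefix):].partition(": ")
        if not sep:
            continue
        parent = os.path.normpath(parent)
        # Keep entries located in `directory` itself or in one of its subdirectories.
        if parent == root or parent.startswith(root + os.sep):
            found.append(os.path.join(parent, name))
    return found
```

The text to parse could come from subprocess.run(["diff", "-rq", "dir1", "dir2"], capture_output=True, text=True).stdout. Note that diff exits with status 1 when it finds differences, so that status should not be treated as a failure.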
Answer 11:
This bash script prints the commands for syncing two directories:
dir1=/tmp/path_to_dir1
dir2=/tmp/path_to_dir2
diff -rq $dir1 $dir2 | sed -e "s|Only in $dir2\(.*\): \(.*\)|cp -r $dir2\1/\2 $dir1\1|" | sed -e "s|Only in $dir1\(.*\): \(.*\)|cp -r $dir1\1/\2 $dir2\1|"
Answer 12:
This answer improves on one of the suggestions from @Adail-Junior by adding the -D option, which is helpful when neither of the directories being compared is a git repository:
git diff -D --no-index dir1/ dir2/
If you use -D, then you won't see comparisons to /dev/null such as:
Binary files a/whatever and /dev/null differ
Answer 13:
A simplified way to compare 2 directories using the diff command:
diff filename.1 filename.2 > filename.dat
Press Enter, open filename.dat after the run is complete, and you will see:
Only in filename.1: filename.2
Only in: directory_name: name_of_file1
Only in: directory_Name: name_of_file2
Answer 14:
GNU grep can invert the search with the -v option. This makes grep report the lines that do not match. This way you can remove the files in dir2 from the list of files in dir1.
grep -v -F -x -f <(find dir2 -type f -printf '%P\n') <(find dir1 -type f -printf '%P\n')
The options -F -x tell grep to perform a fixed-string search on the whole line.
Answer 15:
kdiff3 has a nice diff interface for files and directories.
It works on Windows, Linux, and macOS.
You can install it in multiple ways:
- Windows
- 64-bit installer
- 32-bit installer
- macOS
- Binary
- Homebrew Cask:
brew install --cask kdiff3 (older Homebrew releases used brew cask install kdiff3)