I want to tell whether two tarball files contain identical files, in terms of file name and file content, not including meta-data like date, user, group.
However, there are some restrictions. First, I have no control over whether the meta-data is included when the tar file is made; in fact, the tar file always contains meta-data, so directly diffing the two tar files doesn't work. Second, some tar files are so large that I cannot afford to untar them into a temp directory and diff the contained files one by one. (I know that if I can untar file1.tar into file1/, I can compare them by invoking 'tar -dvf file2.tar' in file1/. But usually I cannot afford to untar even one of them.)
Any idea how I can compare the two tar files? It would be better if it can be accomplished within a shell script. Alternatively, is there any way to get each sub-file's checksum without actually untarring a tarball?
Thanks,
One can use a simple script:
Usage:
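A minimal sketch of such a script, assuming GNU tar (for `--to-command` and the `TAR_FILENAME` variable it exports) and `sha256sum` are available. The function names `tar_sums` and `tar_same` are illustrative and not taken from the original answer.

```shell
#!/bin/sh
# tar_sums ARCHIVE
# Print "<sha256>  <member name>" for each file stored in ARCHIVE, sorted by
# name. The data is streamed through sha256sum; nothing is written to disk.
# Assumption: GNU tar, whose --to-command runs the given command once per
# member via the shell and exports the member name as $TAR_FILENAME.
tar_sums() {
    tar -xf "$1" \
        --to-command='printf "%s  %s\n" "$(sha256sum | cut -c1-64)" "$TAR_FILENAME"' |
        sort -k 2
}

# tar_same A B
# Return 0 if both archives hold the same names with the same contents.
# Otherwise print the differing checksum lines and return diff's status.
tar_same() {
    list_a=$(mktemp) && list_b=$(mktemp) || return 2
    tar_sums "$1" >"$list_a"
    tar_sums "$2" >"$list_b"
    diff "$list_a" "$list_b"
    status=$?
    rm -f "$list_a" "$list_b"
    return "$status"
}
```

Called as `tar_same file1.tar file2.tar`, it prints nothing and returns 0 when names and contents match. Only the two small checksum lists are stored temporarily, so timestamps and ownership play no part in the comparison.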
Is tardiff what you're looking for? It's "a simple perl script" that "compares the contents of two tarballs and reports on any differences found between them."
If you neither want to extract the archives nor need to see the differences, try diff's -q option:
diff -q 1.tar 2.tar
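The quiet result will be "Files 1.tar and 2.tar differ", or nothing if there are no differences. Note that this compares the archives byte for byte, so differing meta-data such as timestamps or owners also counts as a difference.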
Here is my variant; it also checks the Unix permissions:
It works only if the filenames are shorter than 200 characters.
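One way a permission-aware comparison could be sketched, assuming GNU tar, which exports `TAR_MODE` and `TAR_FILENAME` to the command given to `--to-command`. This is not the script from the answer above, and the function name `tar_mode_sums` is hypothetical.

```shell
#!/bin/sh
# tar_mode_sums ARCHIVE
# Print "<mode> <sha256>  <member name>" for each file stored in ARCHIVE,
# sorted by name, without extracting anything to disk.
# Assumption: GNU tar; $TAR_MODE holds the member's mode as an octal number
# and $TAR_FILENAME holds its name.
tar_mode_sums() {
    tar -xf "$1" \
        --to-command='printf "%s %s  %s\n" "$TAR_MODE" "$(sha256sum | cut -c1-64)" "$TAR_FILENAME"' |
        sort -k 3
}
```

Two archives can then be compared by saving the output for each and running `diff` on the two lists; a changed permission shows up as a differing line even when the contents are identical.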
There is a tool called archdiff. It is basically a Perl script that can look into the archives.