How to compare the contents of two tarballs

Posted 2019-03-09 18:26

I want to tell whether two tarball files contain identical files, in terms of file name and file content, not including meta-data like date, user, group.

However, there are some restrictions. First, I have no control over whether the meta-data is included when the tar file is made; in fact, the tar file always contains meta-data, so directly diffing the two tar files doesn't work. Second, some tar files are so large that I cannot afford to untar them into a temp directory and diff the contained files one by one. (I know that if I can untar file1.tar into file1/, I can compare them by invoking 'tar -dvf file2.tar' in file1/. But usually I cannot afford to untar even one of them.)

Any idea how I can compare the two tar files? It would be better if it can be accomplished within shell scripts. Alternatively, is there any way to get each sub-file's checksum without actually untarring the tarball?

Thanks,

11 answers
来,给爷笑一个
#2 · 2019-03-09 18:53

One can use a simple script, provided there is enough free disk space to extract both archives:

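#!/usr/bin/env bash
# Extract both tarballs into temporary directories and diff the two trees.
set -eu

tar1=$1
tar2=$2
shift 2
diff_opts=("$@")   # any remaining arguments are passed on to diff

tmp1=$(mktemp -d)
tmp2=$(mktemp -d)
# Remove both temporary directories on exit, whatever the outcome.
trap 'rm -rf "$tmp1" "$tmp2"' EXIT

tar xf "$tar1" -C "$tmp1"
tar xf "$tar2" -C "$tmp2"

# The :+ expansion avoids an "unbound variable" error on an empty array in older bash.
diff -ur "${diff_opts[@]:+${diff_opts[@]}}" "$tmp1" "$tmp2"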

Usage:

diff-tars.sh TAR1 TAR2 [DIFF_OPTS]
仙女界的扛把子
#3 · 2019-03-09 18:59

Is tardiff what you're looking for? It's "a simple perl script" that "compares the contents of two tarballs and reports on any differences found between them."

混吃等死
#4 · 2019-03-09 19:01

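If you do not want to extract the archives and do not need to see the differences, try diff's -q option:

diff -q 1.tar 2.tar

The quiet result is "Files 1.tar and 2.tar differ", or nothing if there are no differences. Note that this compares the tar files byte for byte, so differences in meta-data such as dates or owners also count as differences.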
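
A quick check that ignores meta-data is to compare only the member names of the two archives. This is a minimal sketch, assuming a tar that supports -t and -f; the function name same_tar_names is made up for illustration, and file contents are not compared at all.

```shell
# Sketch only; same_tar_names is an illustrative name, not an existing tool.
# Succeeds if both archives list exactly the same member names.
# Dates, owners, modes, and file contents are all ignored.
same_tar_names() {
    # Fail early if either archive is unreadable, so two empty
    # listings are never mistaken for a match.
    [ -r "$1" ] && [ -r "$2" ] || return 2
    [ "$(tar -tf "$1" | LC_ALL=C sort)" = "$(tar -tf "$2" | LC_ALL=C sort)" ]
}
```

For example, same_tar_names 1.tar 2.tar && echo "same names" prints the message only when the name lists match.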

祖国的老花朵
#5 · 2019-03-09 19:03

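Here is my variant, which also checks the Unix permissions. It compares the size, name, and permission string of each entry in the 'tar -tvf' listing, so it does not compare file contents.

It works only if the file names are shorter than 200 characters and contain no spaces.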

diff <(tar -tvf 1.tar | awk '{printf "%10s %200s %10s\n",$3,$6,$1}'|sort -k2) <(tar -tvf 2.tar|awk '{printf "%10s %200s %10s\n",$3,$6,$1}'|sort -k2)
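
The question also asks whether per-file checksums can be obtained without untarring. This is a minimal sketch, assuming GNU tar (its --to-command option and the TAR_FILENAME variable it sets for the command) and sha256sum from coreutils. The function names tar_sums and same_tar_contents are made up for illustration. Only regular files are hashed; directories, symlinks, and all meta-data are ignored.

```shell
# Sketch only. Assumes GNU tar (--to-command, TAR_FILENAME) and sha256sum.
# tar_sums and same_tar_contents are illustrative names, not existing tools.

# Print "checksum  name" for each regular file in the archive, sorted by name.
# File data is piped straight from tar into sha256sum; nothing is written to disk.
tar_sums() {
    tar -xf "$1" --to-command='printf "%s  %s\n" "$(sha256sum | cut -d" " -f1)" "$TAR_FILENAME"' |
        LC_ALL=C sort -k2
}

# Succeed if both archives hold the same file names with the same contents.
same_tar_contents() {
    # Fail early if either archive is unreadable, so two empty
    # listings are never mistaken for a match.
    [ -r "$1" ] && [ -r "$2" ] || return 2
    [ "$(tar_sums "$1")" = "$(tar_sums "$2")" ]
}
```

In bash, diff <(tar_sums 1.tar) <(tar_sums 2.tar) would additionally show which files differ.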
贼婆χ
#6 · 2019-03-09 19:03

There is a tool called archdiff. It is basically a Perl script that can look into the archives.

It takes two archives, or an archive and a directory, and shows a summary of the
differences between them.