I want to compare the total size of two directories, dir1 and dir2, on different file systems, so that if diff -r dir1 dir2 returns 0 then the total sizes will be equal.
The du command reports disk usage, which depends on the file system, and its --apparent-size option doesn't solve the problem either. I currently use something like
find dir1 ! -type d |xargs wc -c |tail -1
to get an approximation of dir1's size. Is there a better solution?
Edit:
For example, with diff -r dir1 dir2 returning 0 (the directories are equal), I get:
du -s dir1 --> 540
du -s dir2 --> 166
du -sb dir1 --> 250815 (the -b option is equivalent to --apparent-size -B1)
du -sb dir2 --> 71495
find dir1 ! -type d |xargs wc -c --> 62399
find dir2 ! -type d |xargs wc -c --> 62399
I'm not sure exactly what you want. Maybe this?
diff <(du -sh dir1 | cut -f1) <(du -sh dir2 | cut -f1)
If your version of find has -printf, you may find this to be quite a bit faster:
find dir1 ! -type d -printf "%s\n" | awk '{sum += $1} END{print sum}'
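To answer the original question directly, the same pipeline can be wrapped so that two directories are compared by total file size. This is only a sketch: the function names total_bytes and same_total_size are made up for illustration, and it assumes a find that supports -printf (GNU findutils, for example).

```shell
#!/bin/sh
# Sketch only: total_bytes and same_total_size are hypothetical helper names.
# Assumes a find that supports -printf (e.g. GNU findutils).

# Print the sum of the byte sizes of everything under $1 that is not a directory.
total_bytes() {
    find "$1" ! -type d -printf '%s\n' |
        awk '{ sum += $1 } END { printf "%.0f\n", sum }'   # %.0f avoids scientific notation
}

# Succeed (exit status 0) when both trees have the same total size.
same_total_size() {
    [ "$(total_bytes "$1")" = "$(total_bytes "$2")" ]
}
```

Usage would then look like: same_total_size dir1 dir2 && echo "sizes match". Because find passes sizes rather than file names down the pipe, names containing spaces or newlines do not cause trouble here.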
There are at least two ways to avoid scientific notation for outputting large numbers in AWK.
END {OFMT = "%.0f"; print sum}
END {printf "%.0f\n", sum}
The .0 precision suppresses the decimal places, since we're really dealing with an integer, and gawk's %d seems to incorrectly act like %g in version 3.1.5 (but not in 3.1.6 and later).
However, from the gawk documentation:
NOTE: When using the integer format-control letters for values
that are outside the range of the widest C integer type, 'gawk'
switches to the '%g' format specifier.
Beware of exceeding the maximum integer for your system/version of AWK.
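AWK implementations typically store numbers as double-precision floats, so a sum stays exact only up to 2^53. If that limit is a concern, one possible alternative is to add the sizes with the shell's integer arithmetic instead. The sketch below assumes find supports -printf and that the shell uses 64-bit integers (common on current systems, but not guaranteed everywhere); total_bytes_int is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: total_bytes_int is a hypothetical helper name.
# Assumes find supports -printf and the shell does 64-bit integer arithmetic.
total_bytes_int() {
    find "$1" ! -type d -printf '%s\n' | {
        total=0
        # Add each size with shell integer arithmetic, not AWK floating point.
        while read -r size; do
            total=$((total + size))
        done
        printf '%s\n' "$total"
    }
}
```

This will likely be slower than the AWK version on large trees, since the loop runs in the shell, so it is a trade of speed for exact integer sums.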