How can we determine that two Docker images have exactly the same file system structure, and that the contents of corresponding files are the same, irrespective of file timestamps?
I tried the image ids but they differ when building from the same Dockerfile and a clean local repository: I did this test by building one image, cleaning the local repository, then touching one of the files to change its modification date, then building the second image, and their image ids do not match. I used Docker 17.06 (the latest version I believe).
Thanks
After some research I came up with a solution which is fast and clean per my tests.
The overall solution is this:
docker create ...
docker export ...
And that's it.
Technically, this can be done as follows:
1) Create a file named md5docker and give it execution rights, e.g., chmod +x md5docker.
2) Create a file named tarcat and give it execution rights, e.g., chmod +x tarcat.
3) Now invoke ./md5docker <image>, where <image> is your image name or id, to compute an MD5 hash of the entire file system of your image.
To verify whether two images have the same contents, just check that their hashes, as computed in step 3), are equal.
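The two scripts themselves are not shown here, so what follows is only a minimal Python 3 sketch of the idea, not the original tarcat. The function name tar_content_digest and its algorithm parameter are my own assumptions. It reads a tar stream (such as the one docker export produces), records entry names, entry types, link targets and regular file contents, and never looks at timestamps. Sorting the records before hashing is an extra choice of mine, so that entry order in the archive does not affect the result.

```python
import hashlib
import tarfile


def tar_content_digest(fileobj, algorithm="md5"):
    """Digest of a tar stream's structure and file contents, ignoring timestamps.

    Hypothetical helper: the name and signature are illustrative only.
    """
    records = []
    # "r|*" reads the archive sequentially, so it also works on a pipe.
    with tarfile.open(fileobj=fileobj, mode="r|*") as archive:
        for member in archive:
            if member.isdir():
                record = ("dir", member.name, "")
            elif member.issym():
                record = ("symlink", member.name, member.linkname)
            elif member.islnk():
                record = ("hardlink", member.name, member.linkname)
            elif member.isreg():
                file_hash = hashlib.new(algorithm)
                stream = archive.extractfile(member)
                # Hash the file in 1 MiB chunks to keep memory use flat.
                for chunk in iter(lambda: stream.read(1 << 20), b""):
                    file_hash.update(chunk)
                record = ("file", member.name, file_hash.hexdigest())
            else:
                # Devices, FIFOs and other entry types are skipped.
                continue
            records.append(record)

    total = hashlib.new(algorithm)
    # Sort so that the order of entries in the archive does not matter.
    for record in sorted(records):
        total.update("\0".join(record).encode("utf-8", "surrogateescape"))
        total.update(b"\n")
    return total.hexdigest()
```

To use it on an image, one would create a container with docker create <image>, pipe docker export <container> into a small script that calls tar_content_digest(sys.stdin.buffer), and compare the printed digests of the two images. Passing algorithm="sha256" switches the hash function.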
Note that this solution only considers directory structure, regular file contents, and links (symbolic and hard) as content. If you need more, just change the tarcat script by adding more elif clauses testing for the content you wish to include (see Python's tarfile module, and look for the TarInfo.isXXX() methods corresponding to the needed content).
The only limitation I see in this solution is its dependency on Python (I am using Python 3, but it should be very easy to adapt to Python 2). A better solution without any dependency, and probably faster (hey, this is already very fast), would be to write the tarcat script in a language supporting static linking, so that a standalone executable is enough (i.e., one requiring no external dependencies beyond the OS itself). I leave this as a future exercise in C, Rust, OCaml, Haskell, you choose.
Note, if MD5 does not suit your needs, just replace md5 inside the first script with your hash utility.
Hope this helps anyone reading.
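As an illustration of the kind of extra clauses meant above, here is a small sketch using the TarInfo.isfifo(), TarInfo.ischr() and TarInfo.isblk() checks from Python's tarfile module. The helper name special_entry_record and the shape of the tuple it returns are assumptions made for this example, not part of the original script.

```python
import tarfile


def special_entry_record(member):
    """Describe FIFOs and device nodes; return None for any other entry type.

    Hypothetical helper: the (kind, name, detail) layout is illustrative only.
    """
    if member.isfifo():
        return ("fifo", member.name, "")
    if member.ischr():
        # Character devices are identified by their major:minor numbers.
        return ("chardev", member.name, f"{member.devmajor}:{member.devminor}")
    if member.isblk():
        # Block devices likewise.
        return ("blockdev", member.name, f"{member.devmajor}:{member.devminor}")
    return None
```

Whatever such a helper returns would then be fed into the hash alongside the entries already covered, so that two images differing only in, say, a device node no longer compare as equal.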
If you want to compare the content of images, you can use the docker inspect <imageName> command (for example, docker inspect redis) and look at the RootFS section of its output. If all layers listed there are identical for both images, then the images contain identical content.
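A minimal Python sketch of that comparison follows. The function names are my own, and the layer digests used when trying it out would be whatever docker inspect prints for your images. One caveat, as far as I know: a layer digest is computed over the layer's tar archive, which includes metadata such as modification times, so equal layer lists imply equal content, but differing lists do not by themselves prove that the file contents differ.

```python
import json
import subprocess


def layers_from_inspect(inspect_json):
    """Extract the RootFS layer list from the JSON text printed by docker inspect."""
    documents = json.loads(inspect_json)
    # docker inspect prints a JSON array with one object per inspected name.
    return documents[0]["RootFS"]["Layers"]


def inspect_image(image):
    """Run docker inspect on one image and return its raw JSON output."""
    result = subprocess.run(
        ["docker", "inspect", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def same_layers(image_a, image_b):
    """True if both images list exactly the same layers in the same order."""
    layers_a = layers_from_inspect(inspect_image(image_a))
    layers_b = layers_from_inspect(inspect_image(image_b))
    return layers_a == layers_b
```

Calling same_layers("redis", "redis") should return True on a machine where the redis image is present locally.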
There doesn't seem to be a standard way of doing this. The best way that I can think of is using the Docker multistage build feature. For example, here I am comparing the alpine and debian images. In your case, set the image names to the ones you want to compare.
I basically copy all the files from each image into a git repository and commit after each copy.
This will give you an image with a git repository that records the differences between the two images.
Now if you do a git log you can see the logs, and you can compare the two commits using git diff <commit1> <commit2>.
Note: If the image building fails at the second commit, this means that the images are identical, since a git commit will fail if there are no changes to commit.