A. Here is how I created the image:
- Got latest Ubuntu image
- Ran as container and attached to it
- Cloned source code from git inside docker container
- Tagged and pushed docker image to my registry
B. And from a different machine I pulled, changed and pushed it by doing:
- Docker pull from the registry
- Start container with the pulled image and attach to it
- Change something in the cloned git directory
- Stop container, tag and push it to registry
Now the issue I'm seeing is that every time B is repeated, it tries to upload ~600MB (the public image layer) to the registry, which takes a long time in my case.
Is there any way to avoid uploading the whole 600MB and instead push only the directory that has changed?
What am I doing wrong? How do you guys use docker for frequent pushes?
Docker will only push changed layers, so it looks as though something in your workflow is not quite right. It will be much clearer if you use a Dockerfile, as each instruction explicitly creates a layer, but even with docker commit the results should be the same.
Example - run a container from the ubuntu image, run apt-get update, and then commit the container to a new image. Now run docker history and you'll see the new image adds a layer on top of the ubuntu image, which has the additional state from running the APT update:
> docker history sixeyed/temp1
IMAGE          CREATED              CREATED BY                                      SIZE       COMMENT
2d98a4114b7c   About a minute ago   /bin/bash                                       22.2 MB
14b59d36bae0   7 months ago         /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B
<missing>      7 months ago         /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/   1.895 kB
<missing>      7 months ago         /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic   194.5 kB
<missing>      7 months ago         /bin/sh -c #(nop) ADD file:620b1d9842ebe18eaa   187.8 MB
In this case, the diff between ubuntu and my temp1 image is the 22MB layer 2d98.
Now if I run a new container from temp1, create an empty file and run docker commit to create a new image, the new layer only has the changed file:
> docker history sixeyed/temp2
IMAGE          CREATED              CREATED BY                                      SIZE       COMMENT
e9ea4b4963e4   45 seconds ago       /bin/bash                                       0 B
2d98a4114b7c   About a minute ago   /bin/bash                                       22.2 MB
14b59d36bae0   7 months ago         /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B
<missing>      7 months ago         /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/   1.895 kB
<missing>      7 months ago         /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic   194.5 kB
<missing>      7 months ago         /bin/sh -c #(nop) ADD file:620b1d9842ebe18eaa   187.8 MB
When I push the first image, only the 22MB layer will get uploaded - the others are mounted from ubuntu, which is already in the Hub. If I push the second image, only the changed layer gets pushed - the temp1 layer is mounted from the first push:
> docker push sixeyed/temp2
The push refers to a repository [docker.io/sixeyed/temp2]
f741d3d3ee9e: Pushed
64f89772a568: Mounted from sixeyed/temp1
5f70bf18a086: Mounted from library/ubuntu
6f32b23ac95d: Mounted from library/ubuntu
14d918629d81: Mounted from library/ubuntu
fd0e26195ab2: Mounted from library/ubuntu
So if your pushes are uploading 600MB, you're either making 600MB of changes to the image, or your workflow is preventing Docker from using layers correctly.
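To make the idea concrete, here is a toy model in Python of a push that skips layers the registry already holds. It is only a sketch: the layer contents, the in-memory registry dict and the push function are made up for illustration, and the real Docker client and registry protocol are more involved.

```python
import hashlib


def digest(content: bytes) -> str:
    """Content-addressed ID: identical content always gives the same digest."""
    return hashlib.sha256(content).hexdigest()


def push(image_layers, registry):
    """Upload only the layers the registry lacks; return the bytes uploaded."""
    uploaded = 0
    for content in image_layers:
        layer_id = digest(content)
        if layer_id in registry:
            continue  # registry already has this layer, nothing to send
        registry[layer_id] = content
        uploaded += len(content)
    return uploaded


# Hypothetical layer contents, scaled down; not real image data.
base = [b"ubuntu-rootfs" * 1000, b"policy-rc.d", b"sources.list"]
temp1 = base + [b"apt-lists" * 100]      # base plus one committed change
temp2 = temp1 + [b"empty-file-marker"]   # temp1 plus one more small change

registry = {}
first = push(base, registry)     # everything is new, so all base layers go up
second = push(temp1, registry)   # only the added layer is uploaded
third = push(temp2, registry)    # again, only the newest layer is uploaded
repeat = push(temp2, registry)   # nothing changed, so nothing is uploaded
print(first, second, third, repeat)
```

If a push in this model uploaded the base layers again, it would mean their content (and so their digest) had changed, which is the kind of thing to look for in a workflow that keeps re-sending 600MB.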
Docker already uploads only the changed layer.
It is similar to how docker build only rebuilds the layers whose cache has been invalidated. Of course, it has to check with the registry which layers are already available (it reports those as Layer already exists). And if you have changed the order of the instructions in your Dockerfile, every layer from the first changed instruction onward is treated as new and will generally be re-uploaded.
FROM ubuntu
RUN echo "hello"
EXPOSE 80
and
FROM ubuntu
EXPOSE 80
RUN echo "hello"
These two images are miles apart even though the behavioral end result is the same. So watch out for things like instruction order.
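A rough way to picture why the order matters is to give each build step an ID derived from its parent's ID plus the instruction text. The Python sketch below does that for the two Dockerfiles. It is a toy model with made-up helper names (build_ids, shared_prefix), not Docker's actual ID or cache-key algorithm.

```python
import hashlib


def build_ids(instructions):
    """Return one ID per step, each derived from the parent ID and the instruction."""
    ids = []
    parent = ""
    for instruction in instructions:
        parent = hashlib.sha256(f"{parent}\n{instruction}".encode()).hexdigest()
        ids.append(parent)
    return ids


def shared_prefix(a, b):
    """Count how many leading steps the two builds have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


first = build_ids(["FROM ubuntu", 'RUN echo "hello"', "EXPOSE 80"])
second = build_ids(["FROM ubuntu", "EXPOSE 80", 'RUN echo "hello"'])

# Only the FROM step matches; every later step has a different parent chain.
print(shared_prefix(first, second))
```

In this model the two builds agree only on the base step, so anything after the first reordered instruction cannot be reused from the other build.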