I am brand new to Docker and am trying to understand exactly what a Docker image is. Every single definition of a Docker image uses the term "layer", but does not seem to define what is meant by layer.
From the official Docker docs:
We’ve already seen that Docker images are read-only templates from which Docker containers are launched. Each image consists of a series of layers. Docker makes use of union file systems to combine these layers into a single image. Union file systems allow files and directories of separate file systems, known as branches, to be transparently overlaid, forming a single coherent file system.
So I ask, what is a layer (exactly); can someone give a few concrete examples of them? And how do these layers "snap together" to form an image?
They make the most sense to me with an example...
Examining layers of your own build with docker diff
Let's take a contrived example Dockerfile:
Each of those dd commands outputs a 1M file to the disk. Let's build the image with an extra flag to save the temporary containers:
In the output, you'll see each of the running commands happen in a temporary container that we now keep instead of automatically deleting:
If you run a docker diff on each of those container IDs, you'll see what files were created in those containers:
Each line prefixed with an A is adding the file, the C indicates a change to an existing file, and the D indicates a delete.
Here's the TL;DR part
Each of these container filesystem diffs above goes into one "layer" that gets assembled when you run the image as a container. The entire file is in each layer when there's an add or change, so each of those chmod commands, despite just changing a permission bit, results in the entire file being copied into the next layer. The deleted /data/one file is still in the previous layers, 3 times in fact, and will be copied over the network and stored on disk when you pull the image.
Examining existing images
You can see the commands that go into creating the layers of an existing image with the docker history command. You can also run a docker image inspect on an image and see the list of layers under the RootFS section.
Here's the history for the above image:
The newest layers are listed on top. Of note, there are two layers at the bottom that are fairly old. They come from the busybox image itself. When you build one image, you inherit all the layers of the image you specify in the FROM line. There are also layers being added for changes to the image meta-data, like the CMD line. They barely take up any space and are more for record keeping of what settings apply to the image you are running.
Why layers?
The layers have a couple of advantages. First, they are immutable. Once created, a layer identified by a sha256 hash will never change. That immutability allows images to safely build and fork off of each other. If two Dockerfiles have the same initial set of lines, and are built on the same server, they will share the same set of initial layers, saving disk space. That also means if you rebuild an image with only the last few lines of the Dockerfile changed, only those layers need to be rebuilt and the rest can be reused from the layer cache. This can make rebuilding Docker images very fast.
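To make the sharing and caching idea a bit more concrete, here is a minimal Python sketch. It is a toy model and not Docker's actual cache algorithm: the layer_id and build functions, and the choice to hash the parent ID together with the instruction text, are assumptions made purely for illustration. The Dockerfile lines other than FROM busybox are hypothetical as well.

```python
import hashlib


def layer_id(parent_id, instruction):
    """Toy layer ID (assumption): sha256 over the parent ID and the instruction text."""
    data = f"{parent_id}\n{instruction}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def build(instructions, cache):
    """Return the chain of layer IDs and the number of layers that had to be built."""
    chain = []
    built = 0
    parent = ""
    for instruction in instructions:
        current = layer_id(parent, instruction)
        if current not in cache:
            # Not seen before: "build" it and remember it in the cache.
            cache.add(current)
            built += 1
        chain.append(current)
        parent = current
    return chain, built


cache = set()
# Hypothetical Dockerfiles that differ only in their last line.
first = ["FROM busybox", "RUN mkdir /data", "CMD ls /data"]
second = ["FROM busybox", "RUN mkdir /data", "CMD ls -alh /data"]

chain_a, built_a = build(first, cache)
chain_b, built_b = build(second, cache)

print(built_a)                     # every layer of the first build is new
print(built_b)                     # only the changed last line is rebuilt
print(chain_a[:2] == chain_b[:2])  # the shared prefix maps to the same layer IDs
```

Because each ID depends on everything before it, changing an early line in this model invalidates every later layer, which is why frequently changing steps are usually placed near the end of a Dockerfile.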
Inside a container, you see the image filesystem, but that filesystem is not copied. On top of those image layers, the container mounts its own read-write filesystem layer. Every read of a file goes down through the layers until it hits a layer that has marked the file for deletion, has a copy of the file in that layer, or the read runs out of layers to search through. Every write makes a modification in the container-specific read-write layer.
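The read and write rules just described can be sketched in a few lines of Python. This is a simplified in-memory model, not how a real union filesystem driver is implemented: the Container class, the WHITEOUT marker, and the /data/two path are all hypothetical names chosen for the example.

```python
WHITEOUT = object()  # marker meaning "this path was deleted in this layer"


class Container:
    """Toy model of a container on top of read-only image layers."""

    def __init__(self, image_layers):
        # image_layers is ordered oldest first; each layer maps path -> content.
        self.image_layers = image_layers
        self.rw_layer = {}  # container-specific read-write layer

    def read(self, path):
        # Search the read-write layer first, then image layers newest to oldest.
        for layer in [self.rw_layer] + self.image_layers[::-1]:
            if path in layer:
                if layer[path] is WHITEOUT:
                    raise FileNotFoundError(path)  # marked as deleted
                return layer[path]
        raise FileNotFoundError(path)  # ran out of layers to search

    def write(self, path, content):
        # Writes never touch the image layers.
        self.rw_layer[path] = content


layers = [
    {"/data/one": "original one", "/data/two": "original two"},
    {"/data/one": WHITEOUT},  # a later layer deleted /data/one
]
container = Container(layers)

before = container.read("/data/two")   # found in the oldest image layer
container.write("/data/two", "changed")
after = container.read("/data/two")    # now served from the read-write layer

try:
    container.read("/data/one")
    deleted = False
except FileNotFoundError:
    deleted = True

print(before, after, deleted)
print(layers[0]["/data/two"])  # the image layer itself is unchanged
```

Note that /data/one is still physically present in the first layer of this model; the later deletion marker only hides it from reads.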
Reducing layer bloat
One downside of the layers is building images that duplicate files or ship files that are deleted in a later layer. The solution is often to merge multiple commands into a single RUN command. Particularly when you are modifying existing files or deleting files, you want those steps to run in the same command where they were first created. A rewrite of the above Dockerfile would look like:
And if you compare the resulting images:
Just by merging together some lines in the contrived example, we got the same resulting content in our image, and shrunk our image from 5MB to just the 1MB file that you see in the final image.
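The arithmetic behind that 5MB versus 1MB difference can be sketched in Python. This is only a back-of-the-envelope model: the per-step contents below are an assumption reconstructed to match the figures quoted in this answer (three stored copies of /data/one, 5MB in total), the /data/two name is hypothetical, and the base image layers are ignored.

```python
MB = 1024 * 1024


def image_size(layers):
    """Sum the bytes stored in every layer; a deletion stores no file data."""
    return sum(size for layer in layers for size in layer.values())


# Assumed sequence of separate RUN steps, one layer per step.
separate_steps = [
    {"/data/one": 1 * MB},                       # first 1M file written
    {"/data/one": 1 * MB},                       # chmod copies the whole file
    {"/data/two": 1 * MB},                       # second 1M file written
    {"/data/one": 1 * MB, "/data/two": 1 * MB},  # chmod copies both files
    {},                                          # rm /data/one: only a deletion marker
]

# The same work merged into one RUN: only the final state is stored.
merged_step = [
    {"/data/two": 1 * MB},
]

print(image_size(separate_steps) // MB)  # 5
print(image_size(merged_step) // MB)     # 1
```

The final filesystem is identical in both cases; the only difference is how many intermediate copies were frozen into layers along the way.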
I think the official document gives a pretty detailed explanation: https://docs.docker.com/engine/userguide/storagedriver/imagesandcontainers/.
An image consists of many layers, which are usually generated from a Dockerfile. Each instruction in the Dockerfile creates a new layer, and the result is an image, which is denoted by the form repo:tag, like ubuntu:15.04.
For more information, please consider reading the official docs above.