When using Docker, we start with a base image. We boot it up, make changes, and those changes are saved in layers forming another image.
So eventually I have an image for my PostgreSQL instance and an image for my web application, whose changes keep being persisted.
So the question is: What is a container?
I couldn't understand the concepts of image and layer despite reading all the questions here, and then eventually stumbled upon this excellent documentation from Docker (duh!).
The example there is really the key to understanding the whole concept. It is a lengthy post, so I am summarising the key points that really need to be grasped for clarity.
Image: A Docker image is built up from a series of read-only layers.
Layer: Each layer represents an instruction in the image's Dockerfile.
Example: The Dockerfile below contains four commands, each of which creates a layer. Importantly, each layer is only a set of differences from the layer before it.
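A four-command Dockerfile in the spirit of the docs' example (reconstructed here as a sketch, so the exact file in the docs may differ):

    # base layer: a read-only Ubuntu filesystem
    FROM ubuntu:15.04
    # adds a layer containing only the copied files
    COPY . /app
    # adds a layer containing only the files changed by the build
    RUN make /app
    # records the default command; its layer adds no files of its own
    CMD python /app/app.py

When the image is rebuilt after a change to /app, only the layers from COPY onwards need to be recreated; the base layer is reused.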
Understanding images and containers from a size-on-disk perspective
To view the approximate size of a running container, you can use the docker ps -s command. You get size and virtual size as two of the outputs:
Size: the amount of data (on disk) that is used for the writable layer of each container.
Virtual size: the amount of data used for the read-only image data used by the container. Multiple containers may share some or all read-only image data, so these figures are not additive; i.e., you can't add up the virtual sizes to calculate how much disk space the image uses.
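Illustrative output (the names and numbers are made up, and exact columns vary by Docker version):

    $ docker ps -s
    CONTAINER ID   IMAGE   ...   NAMES   SIZE
    4a7f3c2d1b0e   nginx   ...   web1    2B (virtual 133MB)
    9b8c7d6e5f4a   nginx   ...   web2    2B (virtual 133MB)

Here both containers report the same 133MB virtual size because they share the same read-only nginx image layers; only the tiny writable layers are per-container, so total disk usage is roughly 133MB + 2B + 2B, not 266MB.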
Another important concept is the copy-on-write strategy
If a file or directory exists in a lower layer within the image, and another layer (including the writable layer) needs read access to it, it just uses the existing file. The first time another layer needs to modify the file (when building the image or running the container), the file is copied into that layer and modified.
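You can watch copy-on-write happen with docker diff, which lists additions (A) and changes (C) in a container's writable layer. A minimal sketch (the container name is made up):

    # start a container; at this point its writable layer is empty
    docker run -d --name cow-demo ubuntu sleep infinity

    # modify a file that lives in a lower, read-only image layer;
    # the first write copies it up into the writable layer
    docker exec cow-demo sh -c 'echo "# tweak" >> /etc/profile'

    # shows C /etc and C /etc/profile: only the modified file was copied up
    docker diff cow-demo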
I hope that helps someone else like me.
An image is to a class as a container is to an object.
A container is an instance of an image as an object is an instance of a class.
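The analogy is easy to see on the command line: one image, many independent containers (the names here are made up):

    # one image (the "class")
    docker pull nginx

    # many containers (the "objects"), each with its own state and identity
    docker run -d --name web1 nginx
    docker run -d --name web2 nginx

web1 and web2 share the image's read-only layers but each gets its own writable layer, just as two objects share a class definition but hold separate state.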
While it's simplest to think of a container as a running image, this isn't quite accurate.
An image is really a template that can be turned into a container. To turn an image into a container, the Docker engine takes the image, adds a read-write filesystem on top and initialises various settings including network ports, container name, ID and resource limits. A running container has a currently executing process, but a container can also be stopped (or exited in Docker's terminology). An exited container is not the same as an image, as it can be restarted and will retain its settings and any filesystem changes.
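A sketch of that lifecycle (the name and port mapping are arbitrary):

    # turn the image into a container: the engine adds a writable layer
    # and fixes settings such as the name, ID and port mapping
    docker create --name app -p 8080:80 nginx

    docker start app    # now there is a running process
    docker stop app     # "exited": no process, but not an image either
    docker start app    # restarts with the same settings and filesystem changes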
A Docker container is a running instance of an image. You can relate an image to a program and a container to a process :)
Maybe explaining the whole workflow can help.
Everything starts with the Dockerfile. The Dockerfile is the source code of the image.
Once the Dockerfile is created, you build it to create the image of the container. The image is just the "compiled version" of the "source code", which is the Dockerfile.
Once you have the image of the container, you can redistribute it using a registry. The registry is like a git repository -- you can push and pull images.
Next, you can use the image to run containers. A running container is very similar, in many aspects, to a virtual machine (but without the hypervisor).
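The whole workflow in commands (the image name and tag are hypothetical):

    # "compile" the Dockerfile in the current directory into an image
    docker build -t myuser/myapp:1.0 .

    # redistribute the image through a registry (push here, pull elsewhere)
    docker push myuser/myapp:1.0
    docker pull myuser/myapp:1.0

    # run a container (an instance) from the image
    docker run -d --name myapp myuser/myapp:1.0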
This post explains many of the basics of Docker containers (it talks about Docker and Puppet, but many of the concepts apply in any context).
A container is just an executable binary that is run by the host OS under a set of restrictions, preset using an application (e.g., Docker) that knows how to tell the OS which restrictions to apply.
The typical restrictions are process-isolation related, security related (like using SELinux protection), and system-resource related (memory, disk, CPU, networking).
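With Docker, many of these restrictions map directly onto docker run flags, for example (the values are arbitrary):

    # memory, CPU and process-count limits are enforced via cgroups;
    # --network none isolates networking; --read-only locks the root filesystem
    docker run -d --name limited \
        --memory 256m --cpus 0.5 --pids-limit 100 \
        --network none --read-only \
        alpine sleep 300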
Until recently, only kernels in Unix-based systems supported the ability to run executables under strict restrictions. That's why most container talk today involves mostly Linux or other Unix-based systems.
Docker is one of those applications that knows how to tell the OS (Linux mostly) what restrictions to run an executable under. The executable is contained in the Docker image, which is just a tarfile. That executable is usually a stripped-down version of a Linux distribution (Ubuntu, CentOS, Debian, etc.) preconfigured to run one or more applications within it.
Though most people use a Linux base as the executable, it can be any other binary application as long as the host OS can run it (see creating a simple base image using scratch). Whether the binary in the Docker image is an OS or simply an application, to the host OS it is just another process, a contained process ruled by preset OS boundaries.
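For instance, a scratch-based Dockerfile might look like this, assuming a statically linked binary called hello (a hypothetical name):

    # start from an empty image: no OS, no files at all
    FROM scratch
    # the single static binary is the whole filesystem
    COPY hello /hello
    # the process the host OS will run and confine
    CMD ["/hello"]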
Other applications that, like Docker, can tell the host OS which boundaries to apply to a process while it is running include LXC, libvirt, and systemd. Docker used to use these applications to indirectly interact with the Linux OS, but now Docker interacts directly with Linux using its own library called "libcontainer".
So containers are just processes running in a restricted mode, similar to what chroot used to do.
IMO what sets Docker apart from any other container technology is its repository (Docker Hub) and its management tools, which make working with containers extremely easy.
See https://en.m.wikipedia.org/wiki/Docker_(Linux_container_engine)