Original question: How to use the VOLUME instruction in Dockerfile?
Revised: updated from the answer below, so the actual question I want to solve is -- how to mount host volumes into docker containers in a Dockerfile during build, i.e., having the docker run -v /export:/export capability during docker build.
Latest Update: See the newly accepted answer, e.g., BuildKit in v18.09.
Was: There had been a solution -- Rocker, which was not from Docker, but now that Rocker is discontinued, I've reverted the answer back to "Not possible" again.
Update: So the answer is "Not possible". I can accept it as an answer, as I know the issue has been extensively discussed at https://github.com/docker/docker/issues/3156. I can understand that portability is a paramount issue for Docker developers; but as a Docker user, I have to say I'm very disappointed about this missing feature. Let me close my argument with a quote from the aforementioned discussion: "I would like to use Gentoo as a base image but definitely don't want > 1GB of Portage tree data to be in any of the layers once the image has been built. You could have some nice and compact containers if it wasn't for the gigantic Portage tree having to appear in the image during the install." Yes, I can use wget or curl to download whatever I need, but the fact that merely a portability consideration is now forcing me to download > 1GB of Portage tree each time I build a Gentoo base image is neither efficient nor user friendly. Furthermore, the package repository WILL ALWAYS be under /usr/portage, thus ALWAYS PORTABLE under Gentoo. Again, I respect the decision, but please allow me to express my disappointment in the meantime. Thanks.
Original question in detail:
From Share Directories via Volumes (http://docker.readthedocs.org/en/v0.7.3/use/working_with_volumes/), it says that the data volumes feature has "been available since version 1 of the Docker Remote API". My Docker is version 1.2.0, but I found the example given in the above article does not work:
# BUILD-USING: docker build -t data .
# RUN-USING: docker run -name DATA data
FROM busybox
VOLUME ["/var/volume1", "/var/volume2"]
CMD ["/usr/bin/true"]
What's the proper way in a Dockerfile to mount host volumes into docker containers, via the VOLUME instruction?
$ apt-cache policy lxc-docker
lxc-docker:
Installed: 1.2.0
Candidate: 1.2.0
Version table:
*** 1.2.0 0
500 https://get.docker.io/ubuntu/ docker/main amd64 Packages
100 /var/lib/dpkg/status
$ cat Dockerfile
FROM debian:sid
VOLUME ["/export"]
RUN ls -l /export
CMD ls -l /export
$ docker build -t data .
Sending build context to Docker daemon 2.56 kB
Sending build context to Docker daemon
Step 0 : FROM debian:sid
---> 77e97a48ce6a
Step 1 : VOLUME ["/export"]
---> Using cache
---> 59b69b65a074
Step 2 : RUN ls -l /export
---> Running in df43c78d74be
total 0
---> 9d29a6eb263f
Removing intermediate container df43c78d74be
Step 3 : CMD ls -l /export
---> Running in 8e4916d3e390
---> d6e7e1c52551
Removing intermediate container 8e4916d3e390
Successfully built d6e7e1c52551
$ docker run data
total 0
$ ls -l /export | wc
20 162 1131
$ docker -v
Docker version 1.2.0, build fa7b24f
It is not possible to use the VOLUME instruction to tell Docker what to mount. That would seriously break portability. This instruction tells Docker that content in those directories does not go in images and can be accessed from other containers using the --volumes-from command line parameter. You have to run the container using -v /path/on/host:/path/in/container to access directories from the host.

Mounting host volumes during build is not possible. There is no privileged build, and mounting the host would also seriously degrade portability. You might want to try using wget or curl to download whatever you need for the build and put it in place.
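For example, with the data image built from the question's own Dockerfile, the host's /export directory can be made visible at run time like this:

$ docker run -v /export:/export data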
I think you can do what you want by running the build via a docker command which itself is run inside a docker container. See Docker can now run within Docker | Docker Blog. A technique like this, which actually accessed the outer Docker from within a container, was used, e.g., while exploring how to Create the smallest possible Docker container | Xebia Blog.
Another relevant article is Optimizing Docker Images | CenturyLink Labs, which explains that if you do end up downloading stuff during a build, you can avoid wasting space in the final image by downloading, building, and deleting the download all in one RUN step.
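A minimal sketch of that single-RUN pattern; the URL is a placeholder, and it assumes wget and a build toolchain are already available in the base image:

FROM debian:sid
# fetch, build, and delete the download in ONE RUN step, so the
# tarball and sources never persist in any layer of the final image
RUN wget -q http://example.com/src.tar.gz \
 && tar -xzf src.tar.gz \
 && (cd src && ./configure && make install) \
 && rm -rf src src.tar.gz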
It's ugly, but I achieved a semblance of this with a Dockerfile plus an imageBuild.sh wrapper script, sketched below. I have a Java build that downloads the universe into /root/.m2, and did so every single time. imageBuild.sh copies the contents of that folder onto the host after the build, and the Dockerfile copies them back into the image for the next build. This is something like how a volume would work (i.e. it persists between builds).
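A minimal sketch of the pair, assuming a Maven build; the image name, container name, and the m2-cache directory (which must exist in the build context, even if empty, for the first build) are all illustrative:

Dockerfile:

FROM maven:3-jdk-8
WORKDIR /build
# seed the local Maven repository with whatever the previous build
# downloaded, so dependencies are not fetched all over again
COPY m2-cache/ /root/.m2/
COPY . .
RUN mvn package

imageBuild.sh:

#!/bin/sh
# build the image, then copy the populated /root/.m2 back onto the
# host so the next build can reuse it
docker build -t my-java-app .
docker create --name m2-extract my-java-app
docker cp m2-extract:/root/.m2/. m2-cache/
docker rm m2-extract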
UPDATE: Somebody just wouldn't take no for an answer, and I like that very much, especially for this particular question.

GOOD NEWS: there is a way now --

The solution is Rocker: https://github.com/grammarly/rocker

John Yani said, "IMO, it solves all the weak points of Dockerfile, making it suitable for development."
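A sketch of what a Rockerfile using Rocker's MOUNT instruction might look like; treat the exact MOUNT form here as an assumption, since the project is discontinued:

FROM debian:sid
# MOUNT is a Rocker extension, not standard Dockerfile syntax: it
# makes the host directory available at build time, like docker run -v
MOUNT /export:/export
RUN ls -l /export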
First, to answer "why doesn't VOLUME work?" When you define a VOLUME in the Dockerfile, you can only define the target, not the source of the volume. During the build, you will only get an anonymous volume from this. That anonymous volume will be mounted at every RUN command, prepopulated with the contents of the image, and then discarded at the end of the RUN command. Only changes to the container are saved, not changes to the volume.

Since this question was asked, a few features have been released that may help. First is multi-stage builds, which allow you to build a disk-space-inefficient first stage and copy just the needed output to the final stage that you ship. The second feature is BuildKit, which is dramatically changing how images are built, with new capabilities being added to the build.
For a multi-stage build, you would have multiple FROM lines, each one starting the creation of a separate image. Only the last image is tagged by default, but you can copy files from previous stages. The standard use is to have a compiler environment to build a binary or other application artifact, and a runtime environment as the second stage that copies over that artifact. You could have something like the sketch below, which would result in a build that only contains the resulting binary, and not the full /export directory.
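A sketch of such a two-stage Dockerfile; the gcc toolchain, the hello.c source file, and the busybox runtime are illustrative choices:

FROM gcc AS build
WORKDIR /export
COPY hello.c .
# the compiler, sources, and everything else in /export stay in this stage
RUN gcc -static -o /hello hello.c

FROM busybox
# only the compiled binary is copied into the image that gets shipped
COPY --from=build /hello /hello
CMD ["/hello"]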
BuildKit is coming out of experimental in 18.09. It's a complete redesign of the build process, including the ability to change the frontend parser. One of those parser changes has implemented the RUN --mount option, which lets you mount a cache directory for your RUN commands. E.g., the sketch below mounts some of the Debian apt directories (with a reconfiguration of the Debian image, this can speed up reinstalls of packages). You would adjust the cache directory for whatever application cache you have, e.g. $HOME/.m2 for Maven, or /root/.cache for golang.
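A sketch of such an apt cache mount, assuming the experimental Dockerfile frontend shipped with 18.09; the installed package is a placeholder:

# syntax=docker/dockerfile:experimental
FROM debian:latest
# the debian image deletes the apt cache after each install by default;
# removing this hook lets the cache mounts below actually persist
RUN rm -f /etc/apt/apt.conf.d/docker-clean
RUN --mount=type=cache,target=/var/cache/apt \
    --mount=type=cache,target=/var/lib/apt \
    apt-get update \
 && apt-get install -y git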
TL;DR: the answer is here: with that RUN --mount syntax, you can also bind mount read-only directories from the build context. The folder must exist in the build context, and because it is mounted from the context, it is mounted read-only; changes are not mapped back to the host or the build client. When you build, you'll want an 18.09 or newer install, and enable BuildKit with export DOCKER_BUILDKIT=1.
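A sketch of such a context bind mount, again assuming the experimental Dockerfile frontend; the export/ directory is assumed to exist in the build context:

# syntax=docker/dockerfile:experimental
FROM busybox
# mount the context's export/ directory at /export, read-only,
# for the duration of this single RUN command
RUN --mount=type=bind,source=export,target=/export \
    ls -l /export

Build it with BuildKit enabled:

$ DOCKER_BUILDKIT=1 docker build -t data .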
There is a way to mount a volume during a build, but it doesn't involve Dockerfiles.

The technique would be to create a container from whatever base you wanted to use (mounting your volume(s) in the container with the -v option), run a shell script to do your image building work, then commit the container as an image when done.

Not only will this leave out the excess files you don't want (this is good for secure files as well, like SSH files), it also creates a single image. It has downsides: the commit command doesn't support all of the Dockerfile instructions, and it doesn't let you pick up where you left off if you need to edit your build script.
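A minimal sketch of that run-and-commit flow; the container name, image name, and build.sh script are all hypothetical:

# run the build work inside a container with the host volume mounted
docker run --name build-step -v /export:/export debian:sid sh /export/build.sh
# snapshot the container's filesystem as an image; note that volume
# contents (everything under /export) are NOT included in the commit
docker commit build-step my-image
docker rm build-step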