Understanding “VOLUME” instruction in DockerFile

Posted 2019-03-08 01:49

Question:

Below is the content of my "Dockerfile"

FROM node:boron

# Create app directory
RUN mkdir -p /usr/src/app

# change working dir to /usr/src/app
WORKDIR /usr/src/app

VOLUME . /usr/src/app

RUN npm install

EXPOSE 8080

CMD ["node" , "server" ]

In this file I expect the "VOLUME . /usr/src/app" instruction to mount the contents of the host's present working directory on the /usr/src/app folder of the container.

Please let me know if this is the correct way ?

Answer 1:

The official Docker tutorial says:

A data volume is a specially-designated directory within one or more containers that bypasses the Union File System. Data volumes provide several useful features for persistent or shared data:

  • Volumes are initialized when a container is created. If the container’s base image contains data at the specified mount point, that existing data is copied into the new volume upon volume initialization. (Note that this does not apply when mounting a host directory.)

  • Data volumes can be shared and reused among containers.

  • Changes to a data volume are made directly.

  • Changes to a data volume will not be included when you update an image.

  • Data volumes persist even if the container itself is deleted.

In a Dockerfile you can specify only the destination of a volume inside the container, e.g. /usr/src/app.

When you run a container, e.g. docker run --volume=/opt:/usr/src/app my_image, you may, but need not, specify the mount point (/opt) on the host machine. If you do not specify the --volume argument, the mount point is chosen automatically.
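As a rough sketch of that split, the snippet below writes a reduced Dockerfile that names only the container-side path, then assembles the matching run command with the host side added. The scratch directory, the reduced Dockerfile and the use of the current directory as the host path are assumptions made for illustration; my_image is the image name used above. The run command is printed rather than executed so the sketch works without a Docker daemon.

```shell
# Hypothetical scratch directory that holds the example Dockerfile.
workdir="$(mktemp -d)"

# The Dockerfile names only the container-side path of the volume.
cat > "$workdir/Dockerfile" <<'EOF'
FROM node:boron
WORKDIR /usr/src/app
VOLUME /usr/src/app
EOF

# The host side is supplied at run time: host path, a colon, then the
# container path. $(pwd) expands to the current directory on the host.
run_cmd="docker run --volume=$(pwd):/usr/src/app my_image"
echo "$run_cmd"
```

When a host directory is mounted this way, it takes the place of the anonymous volume that VOLUME would otherwise create at that path.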



Answer 2:

In short: No, your VOLUME instruction is not correct.

The Dockerfile's VOLUME instruction specifies one or more volumes by their container-side paths. It does not allow the image author to specify a host path. On the host side, the volumes are created with a very long ID-like name inside the Docker root. On my machine this is /var/lib/docker/volumes.

Note: Because the autogenerated name is extremely long and makes no sense from a human's perspective, these volumes are often referred to as "unnamed" or "anonymous".

Your example that uses a '.' character will not even run on my machine, regardless of whether I make the dot the first or the second argument. I get this error message:

docker: Error response from daemon: oci runtime error: container_linux.go:265: starting container process caused "process_linux.go:368: container init caused \"open /dev/ptmx: no such file or directory\"".

I know that what has been said to this point is probably not very valuable to someone trying to understand VOLUME and -v and it certainly does not provide a solution for what you try to accomplish. So, hopefully, the following examples will shed some more light on these issues.

Minitutorial: Specifying volumes

Given this Dockerfile:

FROM openjdk:8u131-jdk-alpine
VOLUME vol1 vol2

(For the outcome of this minitutorial, it makes no difference whether we specify vol1 vol2 or /vol1 /vol2. Don't ask me why.)

Build it:

docker build -t my-openjdk .

Run:

docker run --rm -it my-openjdk

Inside the container, run ls in the command line and you'll notice two directories exist; /vol1 and /vol2.

Running the container also creates two directories, or "volumes", on the host-side.

While having the container running, execute docker volume ls on the host machine and you'll see something like this (I have replaced the middle part of the name with two dots for brevity):

DRIVER    VOLUME NAME
local     c984..e4fc
local     f670..49f0

Back in the container, execute touch /vol1/weird-ass-file (creates a blank file at said location).

This file is now available on the host machine, in one of the unnamed volumes. It took me two tries, because I first looked in the first listed volume, but I eventually found my file in the second listed volume, using this command on the host machine:

sudo ls /var/lib/docker/volumes/f670..49f0/_data

Similarly, you can try to delete this file on the host and it will be deleted in the container as well.

Note: The _data folder is also referred to as a "mount point".
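The guessing game above can probably be avoided by asking Docker which volume backs which container path. The two helper functions below are a minimal sketch; their names are made up for illustration, and they assume the docker CLI is available and that you pass a real container or volume name.

```shell
# Print each mount of a container as "<volume name> -> <container path>".
# $1 is the container name or ID (hypothetical helper).
show_mounts() {
  docker inspect --format '{{ range .Mounts }}{{ println .Name "->" .Destination }}{{ end }}' "$1"
}

# Print the host directory (the "mount point") backing a volume.
# $1 is the volume name as shown by docker volume ls (hypothetical helper).
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

With the container from this minitutorial still running, calling show_mounts with its ID should list one line per volume, and volume_mountpoint with one of those names should print the corresponding _data directory.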

Exit the container and list the volumes on the host. They are gone. We used the --rm flag when running the container, and this option removes not just the container on exit but also its anonymous volumes.

Run a new container, but specify a volume using -v:

docker run --rm -it -v /vol3 my-openjdk

This adds a third volume, so the system ends up with three unnamed volumes. The command would have failed had we specified only -v vol3, because the argument must be an absolute path inside the container. On the host side, the new third volume is anonymous and resides together with the other two volumes in /var/lib/docker/volumes/.

It was stated earlier that the Dockerfile cannot map to a host path, which poses a problem when we try to bring files from the host into the container at runtime. A different -v syntax solves this problem.

Imagine I have a subfolder in my project directory ./src that I wish to sync to /src inside the container. This command does the trick:

docker run -it -v $(pwd)/src:/src my-openjdk

Both sides of the : character expect an absolute path: the left side is an absolute path on the host machine, the right side an absolute path inside the container. pwd is a command that prints the current working directory. Wrapping it in $() runs the command in a subshell and substitutes its output, here the absolute path to our project directory.
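To make the substitution visible, here is a small sketch that builds the -v argument step by step and prints the resulting command instead of running it. The variable names are made up for illustration; the src folder and the my-openjdk image are the ones used in this answer.

```shell
# "$(pwd)" runs pwd and substitutes its output, an absolute path.
# The quotes keep a project path that contains spaces in one piece.
host_src="$(pwd)/src"
mount_spec="${host_src}:/src"

# Check that the host side is absolute. A bare name such as "src" on the
# left of ':' is taken as a volume name, not as a host directory.
case "$host_src" in
  /*) echo "docker run -it -v \"$mount_spec\" my-openjdk" ;;
  *)  echo "host path is not absolute: $host_src" >&2 ;;
esac
```

Quoting the whole argument is a cautious habit: an unquoted $(pwd) is split on whitespace, so a project directory with a space in its name would break the command.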

Putting it all together, assume we have ./src/Hello.java in our project folder on the host machine with the following contents:

public class Hello {
    public static void main(String... ignored) {
        System.out.println("Hello, World!");
    }
}

We build this Dockerfile:

FROM openjdk:8u131-jdk-alpine
WORKDIR /src
ENTRYPOINT javac Hello.java && java Hello

We run this command:

docker run -v $(pwd)/src:/src my-openjdk

This prints "Hello, World!".

The best part is that we're completely free to modify the .java file with a new message for another output on a second run - without having to rebuild the image =)

Final remarks

I am quite new to Docker, and the aforementioned "tutorial" reflects information I gathered from a 3-day command line hackathon. I am almost ashamed I haven't been able to provide links to clear English-like documentation backing up my statements, but I honestly think this is due to a lack of documentation and not personal effort. I do know the examples work as advertised using my current setup which is "Windows 10 -> Vagrant 2.0.0 -> Docker 17.09.0-ce".

The tutorial does not solve the problem "how do we specify the container's path in the Dockerfile and let the run command only specify the host path". There might be a way, I just haven't found it.

Finally, I have a gut feeling that specifying VOLUME in the Dockerfile is not just uncommon; it is probably a best practice never to use VOLUME, for two reasons. The first we have already identified: we cannot specify the host path, which is a good thing because Dockerfiles should be agnostic to the specifics of a host machine. The second is that people might forget to use the --rm option when running the container. One might remember to remove the container but forget to remove the volume. Plus, even with the best of human memory, it might be a daunting task to figure out which of all the anonymous volumes are safe to remove.
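For what it is worth, the docker CLI has a few commands that may help with such leftovers. The helper functions below are only a sketch: the function names are made up, and they assume a reachable Docker daemon. Check the list before removing anything, since removal cannot be undone.

```shell
# List volumes that no container references any more (hypothetical helper).
list_dangling_volumes() {
  docker volume ls --filter dangling=true
}

# Remove a stopped container together with its anonymous volumes.
# $1 is the container name or ID (hypothetical helper).
remove_container_and_volumes() {
  docker rm -v "$1"
}

# Remove unused volumes after asking for confirmation (hypothetical helper).
prune_unused_volumes() {
  docker volume prune
}
```

Running list_dangling_volumes first and only then prune_unused_volumes keeps the cleanup reviewable.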