I've been playing around with Docker for a while and keep on finding the same issue when dealing with persistent data.
I create my `Dockerfile` and expose a volume or use `--volumes-from` to mount a host folder inside my container.
What permissions should I apply to the shared volume on the host?
I can think of two options:
1. So far I've given everyone read/write access, so I can write to the folder from the Docker container.
2. Map the users from the host into the container, so I can assign more granular permissions. I'm not sure this is possible, though, and I haven't found much about it. So far, all I can do is run the container as some user: `docker run -i -t --user="myuser" postgres`, but this user has a different UID than my host `myuser`, so permissions do not work. Also, I'm unsure whether mapping the users would pose security risks.
Are there other alternatives?
How are you guys/gals dealing with this issue?
UPDATE 2016-03-02: As of Docker 1.9.0, Docker has named volumes which replace data-only containers. The answer below, as well as my linked blog post, still has value in the sense of how to think about data inside docker but consider using named volumes to implement the pattern described below rather than data containers.
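As a rough illustration of the named-volume variant, here is a minimal sketch using the current `docker volume create` and `docker run -v name:path` syntax. The volume name `graphite-data` is hypothetical, and the `DOCKER` variable is only a convenience I added so the commands can be dry-run with `DOCKER=echo`.

```shell
# Sketch: a named volume takes the place of the data-only container.
# "graphite-data" is a hypothetical volume name.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print commands instead of running them

create_data_volume() {
    # $1 = volume name; Docker decides where the data lives on the host.
    "$DOCKER" volume create "$1"
}

run_with_volume() {
    # $1 = volume name, $2 = mount point inside the container, $3 = image
    "$DOCKER" run -d -v "$1:$2" "$3"
}

# Usage:
#   create_data_volume graphite-data
#   run_with_volume graphite-data /data/graphite some/graphite
```

Every container that mounts the same named volume sees the same files, so the advice about keeping uid/gids consistent between images still applies.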
I believe the canonical way to solve this is by using data-only containers. With this approach, all access to the volume data is via containers that use `--volumes-from` the data container, so the host uid/gid doesn't matter.
For example, one use case given in the documentation is backing up a data volume. To do this, another container is used to do the backup via `tar`, and it too uses `--volumes-from` in order to mount the volume. So I think the key point to grok is: rather than thinking about how to get access to the data on the host with the proper permissions, think about how to do whatever you need -- backups, browsing, etc. -- via another container. The containers themselves need to use consistent uid/gids, but they don't need to map to anything on the host, thereby remaining portable.
This is relatively new for me as well, but if you have a particular use case feel free to comment and I'll try to expand on the answer.
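The backup use case mentioned above can be sketched roughly as follows. This is modeled on the pattern from the Docker documentation. The container name `graphitedata`, the `busybox` image and the archive name `backup.tar` are assumptions for illustration, and `DOCKER` is an override I added for dry runs.

```shell
# Sketch: back up a data container's volume from a throwaway container.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print the command instead of running it

backup_volume() {
    # $1 = name of the data container (hypothetical, e.g. graphitedata)
    # $2 = absolute host directory that receives the archive
    # $3 = path of the volume inside the containers
    "$DOCKER" run --rm \
        --volumes-from "$1" \
        -v "$2:/backup" \
        busybox tar cf /backup/backup.tar "$3"
}

# Usage:
#   backup_volume graphitedata "$PWD" /data/graphite
```

The only host path involved is the directory that receives the archive. The volume itself is reached through `--volumes-from`, so its ownership never has to match a host user.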
UPDATE: For the given use case in the comments, you might have an image `some/graphite` to run graphite, and an image `some/graphitedata` as the data container. So, ignoring ports and such, the `Dockerfile` of image `some/graphitedata` is something like:
Build and create the data container:
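A minimal sketch of such a data image and the two commands, written as a shell script that generates the Dockerfile. The Debian base image, the fixed uid/gid of 999 and the container name `graphitedata` are my assumptions, not necessarily what the answer originally used.

```shell
# Sketch: data-only image whose volume is owned by a dedicated user.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print commands instead of running them

write_data_dockerfile() {
    # $1 = build context directory
    mkdir -p "$1"
    cat > "$1/Dockerfile" <<'EOF'
FROM debian:stable-slim
# Create the user and group first, with fixed IDs, so every image that
# shares this volume agrees on them (999 is an arbitrary, assumed value).
RUN groupadd -r -g 999 graphite \
 && useradd -r -u 999 -g graphite graphite
# Create the data directory and hand it over before declaring the volume,
# so the volume is initialized with this ownership.
RUN mkdir -p /data/graphite \
 && chown -R graphite:graphite /data/graphite
VOLUME /data/graphite
USER graphite
CMD ["echo", "Data container for graphite"]
EOF
}

build_data_image() {
    # $1 = build context directory
    "$DOCKER" build -t some/graphitedata "$1"
}

create_data_container() {
    # The container exits right away; it only has to exist so that other
    # containers can reference its volume with --volumes-from.
    "$DOCKER" run --name graphitedata some/graphitedata
}
```

Typical use would be `write_data_dockerfile ./graphitedata`, then `build_data_image ./graphitedata`, then `create_data_container`.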
The `some/graphite` Dockerfile should also get the same uid/gids, therefore it might look something like this:
And it would be run as follows:
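Roughly, and only as a sketch: the service image creates the same user with the same IDs and is started with `--volumes-from`. The base image, the uid/gid of 999, the placeholder `CMD` and the container name `graphitedata` are assumptions. The actual graphite installation is left out.

```shell
# Sketch: service image that shares the uid/gid of the data image.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print commands instead of running them

write_graphite_dockerfile() {
    # $1 = build context directory
    mkdir -p "$1"
    cat > "$1/Dockerfile" <<'EOF'
FROM debian:stable-slim
# Same user/group and the same fixed IDs as the data image (999 is assumed).
RUN groupadd -r -g 999 graphite \
 && useradd -r -u 999 -g graphite graphite
# ... graphite installation and configuration would go here ...
VOLUME /data/graphite
USER graphite
# Placeholder command; a real image would start graphite instead.
CMD ["sleep", "infinity"]
EOF
}

run_graphite() {
    # Mounts the data container's volume at the same path in this container.
    "$DOCKER" run -d --volumes-from graphitedata some/graphite
}
```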
Ok, now that gives us our graphite container and associated data-only container with the correct user/group (note you could re-use the `some/graphite` container for the data container as well, overriding the entrypoint/cmd when running it, but having them as separate images IMO is clearer).
Now, let's say you want to edit something in the data folder. Rather than bind mounting the volume to the host and editing it there, create a new container to do that job. Let's call it `some/graphitetools`. Let's also create the appropriate user/group, just like the `some/graphite` image.
You could make this DRY by inheriting from `some/graphite` or `some/graphitedata` in the Dockerfile, or instead of creating a new image just re-use one of the existing ones (overriding entrypoint/cmd as necessary).
Now, you simply run:
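As a sketch of the tools image and the command that starts it: the base image, the uid/gid of 999, the choice of `vim` as the editor package and the container name `graphitedata` are all assumptions for illustration.

```shell
# Sketch: a tools image with the same graphite user, run against the data volume.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print commands instead of running them

write_tools_dockerfile() {
    # $1 = build context directory
    mkdir -p "$1"
    cat > "$1/Dockerfile" <<'EOF'
FROM debian:stable-slim
# Same user/group and IDs as the other graphite images (999 is assumed).
RUN groupadd -r -g 999 graphite \
 && useradd -r -u 999 -g graphite graphite
# Whatever utilities you want at hand; on Debian the vim package provides vi.
RUN apt-get update \
 && apt-get install -y --no-install-recommends vim \
 && rm -rf /var/lib/apt/lists/*
USER graphite
CMD ["/bin/bash"]
EOF
}

run_tools() {
    # Interactive, throwaway container that sees the data container's volume.
    "$DOCKER" run -it --rm --volumes-from graphitedata some/graphitetools
}
```

After `write_tools_dockerfile ./graphitetools` and `docker build -t some/graphitetools ./graphitetools`, calling `run_tools` drops you into a shell as the graphite user.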
Once inside that container, run `vi /data/graphite/whatever.txt`. This works perfectly because all the containers have the same graphite user with matching uid/gid.
Since you never mount `/data/graphite` from the host, you don't care how the host uid/gid maps to the uid/gid defined inside the `graphite` and `graphitetools` containers. Those containers can now be deployed to any host, and they will continue to work perfectly.
The neat thing about this is that `graphitetools` could have all sorts of useful utilities and scripts, that you can now also deploy in a portable manner.
UPDATE 2: After writing this answer, I decided to write a more complete blog post about this approach. I hope it helps.
UPDATE 3: I corrected this answer and added more specifics. It previously contained some incorrect assumptions about ownership and perms -- the ownership is usually assigned at volume creation time i.e. in the data container, because that is when the volume is created. See this blog. This is not a requirement though -- you can just use the data container as a "reference/handle" and set the ownership/perms in another container via chown in an entrypoint, which ends with gosu to run the command as the correct user. If anyone is interested in this approach, please comment and I can provide links to a sample using this approach.
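To make the entrypoint variant a bit more concrete, here is a minimal sketch that generates such an entrypoint script. It assumes `gosu` is installed in the image, that the image does not set `USER` (so the entrypoint starts as root), and that the user and path are the `graphite` and `/data/graphite` from the example above. The file name `docker-entrypoint.sh` is my choice.

```shell
# Sketch: write an entrypoint that fixes ownership, then drops privileges.
write_entrypoint() {
    # $1 = build context directory the image's Dockerfile copies from
    mkdir -p "$1"
    cat > "$1/docker-entrypoint.sh" <<'EOF'
#!/bin/sh
set -e
# Runs as root at container start: take ownership of the volume...
chown -R graphite:graphite /data/graphite
# ...then replace this shell with the requested command, run as graphite.
exec gosu graphite "$@"
EOF
    chmod +x "$1/docker-entrypoint.sh"
}

# Usage:
#   write_entrypoint ./graphite
```

The Dockerfile would then copy the script into the image and name it in `ENTRYPOINT`, with the service command given as `CMD`, so that the command ends up in `"$@"`.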
The same as you, I was looking for a way to map users/groups from host to docker containers and this is the shortest way I've found so far:
This is an extract from my docker-compose.yml.
The idea is to mount the host's user and group lists into the container in read-only mode, so that once the container starts it has the same uid-to-username (and gid-to-group) mappings as the host. Now you can configure user/group settings for your service inside the container as if it were running on your host system.
When you decide to move your container to another host, you just need to change the user name in the service config file to whatever you have on that host.
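The same mounts can also be expressed on the command line. A rough sketch, assuming the host keeps its accounts in `/etc/passwd` and `/etc/group`: the `--user` flag with the caller's numeric IDs is my addition for illustration and is not part of the compose extract, and `DOCKER` is an override for dry runs.

```shell
# Sketch: give the container the host's user and group lists, read-only.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print the command instead of running it

run_with_host_users() {
    # $1 = image, remaining arguments = command to run in the container
    image="$1"
    shift
    "$DOCKER" run --rm \
        -v /etc/passwd:/etc/passwd:ro \
        -v /etc/group:/etc/group:ro \
        --user "$(id -u):$(id -g)" \
        "$image" "$@"
}

# Usage:
#   run_with_host_users debian:stable-slim id
```

Note that this replaces the image's own account files, so any user that exists only inside the image is no longer known by name in the container.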
In my specific case, I was trying to build my node package with the node docker image so that I wouldn't have to install npm on the deployment server. It worked well until, outside of the container and on the host machine, I tried to move a file into the node_modules directory that the node docker image had created. Permission was denied because the directory was owned by root. I realized that I could work around this by copying the directory out of the container onto the host machine. Via docker docs...
This is the bash code I used to change ownership of the directory created by and within the docker container.
If needed, you can remove the directory with a second docker container.
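A minimal sketch of that workflow, under stated assumptions: the container name, the `/build` mount point and the use of plain `npm install` are hypothetical, and `DOCKER` is an override for dry runs. It relies on `docker cp` creating the copied files with the UID:GID of the user who invokes it.

```shell
# Sketch: build in a container, copy the result out, clean up the root-owned copy.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print commands instead of running them

build_in_container() {
    # $1 = container name (hypothetical), $2 = absolute host build directory
    "$DOCKER" run --name "$1" -v "$2:/build" -w /build node npm install
}

copy_out_modules() {
    # $1 = container name, $2 = destination directory on the host.
    # The copy is owned by whoever runs docker cp, not by root.
    "$DOCKER" cp "$1:/build/node_modules" "$2"
}

remove_root_owned_modules() {
    # $1 = absolute host build directory.
    # A second container runs as root, so it can delete what the first one created.
    "$DOCKER" run --rm -v "$1:/build" node rm -rf /build/node_modules
}

# Usage:
#   build_in_container node_builder "$PWD/build"
#   copy_out_modules node_builder "$PWD/dist"
#   remove_root_owned_modules "$PWD/build"
```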
My approach is to detect the current UID/GID, create a matching user/group inside the container, and execute the script as that user, so that all files it creates are owned by the user who ran the script:
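A minimal sketch of that idea, assuming a Debian-based image with `groupadd`, `useradd` and `su` available. The names `hostuser`, `hostgroup` and `/host_mount` are hypothetical. It will not work as-is when the caller is root or when the IDs are already taken inside the image, and the script path must not contain spaces.

```shell
# Sketch: recreate the calling user's IDs inside the container, then run a script as that user.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print the command instead of running it

container_command() {
    # $1 = uid, $2 = gid, $3 = script path as seen inside the container
    printf 'groupadd -g %s hostgroup && useradd -u %s -g %s -m hostuser && su hostuser -c %s' \
        "$2" "$1" "$2" "$3"
}

run_as_host_user() {
    # $1 = absolute host directory to mount, $2 = script inside that directory
    "$DOCKER" run --rm -v "$1:/host_mount" -w /host_mount debian:stable-slim \
        sh -c "$(container_command "$(id -u)" "$(id -g)" "./$2")"
}

# Usage:
#   run_as_host_user "$PWD" run.sh
```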
Ok, this is now being tracked at docker issue #7198
For now, I'm dealing with this using your second option:
Dockerfile
CLI
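One way this second option could look, as a sketch only: instead of hardcoding the IDs, the image takes them as build arguments. The argument names `HOST_UID` and `HOST_GID`, the base image and the use of build arguments at all are my assumptions. The user name `myuser` is taken from the question.

```shell
# Sketch: bake a user with the host's uid/gid into the image, then run as that user.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print commands instead of running them

write_user_dockerfile() {
    # $1 = build context directory
    mkdir -p "$1"
    cat > "$1/Dockerfile" <<'EOF'
FROM debian:stable-slim
# Build arguments (assumed names) carry the host IDs; 1000 is only a default.
ARG HOST_UID=1000
ARG HOST_GID=1000
RUN groupadd -g "$HOST_GID" myuser \
 && useradd -m -u "$HOST_UID" -g "$HOST_GID" -s /bin/bash myuser
EOF
}

build_user_image() {
    # $1 = build context directory, $2 = image tag
    "$DOCKER" build \
        --build-arg "HOST_UID=$(id -u)" \
        --build-arg "HOST_GID=$(id -g)" \
        -t "$2" "$1"
}

run_as_user() {
    # $1 = image tag
    "$DOCKER" run -i -t -u myuser "$1" /bin/bash
}
```

The obvious drawback is that the resulting image is tied to the IDs of the host it was built on.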
UPDATE: I'm currently more inclined toward Hamy's answer.
If you are using Docker Compose, start the container in privileged mode by setting `privileged: true` on the service.