Docker root access to host system

Posted 2020-05-20 07:04

Question:

When I run a container as a normal user, I can map and modify directories owned by root on my host filesystem. This seems to be a big security hole. For example, I can do the following:

$ docker run -it --rm -v /bin:/tmp/a debian
root@14da9657acc7:/# cd /tmp/a
root@f2547c755c14:/tmp/a# mv df df.orig
root@f2547c755c14:/tmp/a# cp ls df
root@f2547c755c14:/tmp/a# exit

Now my host will execute the ls command whenever df is typed (a mostly harmless example). I cannot believe that this is the desired behavior, but it is happening on my system (Debian stretch). The docker command has normal permissions (755, not setuid).

What am I missing?

Maybe it is good to clarify a bit more. I am not at the moment interested in what the container itself does or can do, nor am I concerned with the root access inside the container.

Rather, I notice that anyone on my system who can run a Docker container can use it to gain root access to my host system and read or write whatever they want as root, which effectively gives all such users root access. That is obviously not what I want. How can I prevent this?

Answer 1:

Docker offers several security features that address issues like this. The one that will help you here is User Namespaces.

Basically, you need to stop the Docker daemon on the host machine and then start it again with User Namespaces enabled:

dockerd --userns-remap=default &

Note that this forbids containers from running in privileged mode (a good thing from a security standpoint). The command starts the Docker daemon, which is why the daemon should be stopped before you run it. When you start a container, you can also restrict it to the current non-privileged user, replacing UID:GID with that user's numeric IDs:

docker run -it --rm -v /bin:/tmp/a --user UID:GID debian

Either way, afterwards try starting the container with your original command:

docker run -it --rm -v /bin:/tmp/a debian

If you now try to modify the part of the host filesystem that was mounted into the container (in this case /bin), where files and directories are owned by root, you will receive a Permission denied error. This shows that User Namespaces provide the security functionality you are looking for.
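The reason for the Permission denied error is that, with remapping enabled, uid 0 inside the container corresponds to an unprivileged uid on the host, taken from a subordinate range listed in /etc/subuid. The following is a minimal sketch of that arithmetic, not Docker's own code: the function name host_uid_for is hypothetical, and the entry dockremap:100000:65536 is only an example, so the range on a real host will differ.

```shell
#!/bin/sh
# Map a container uid to the host uid implied by one /etc/subuid-style entry.
# Entry format: name:first_host_uid:count
# host_uid_for is a hypothetical helper for illustration only.
host_uid_for() {
    entry=$1
    container_uid=$2
    start=$(printf '%s\n' "$entry" | cut -d: -f2)
    count=$(printf '%s\n' "$entry" | cut -d: -f3)
    # uids at or beyond the range size have no mapping on the host
    if [ "$container_uid" -ge "$count" ]; then
        echo "uid $container_uid is outside the mapped range" >&2
        return 1
    fi
    echo $((start + container_uid))
}

# Example entry (an assumption); container root maps to the first host uid.
host_uid_for "dockremap:100000:65536" 0    # prints 100000
```

On a host where remapping is active, the real entry can be read with grep dockremap /etc/subuid and passed to the function in place of the example string.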

I recommend going through the Docker lab on this security feature at https://github.com/docker/labs/tree/master/security/userns. I have completed all of the labs and opened issues and PRs there to help keep them accurate, and I can vouch for them.



Answer 2:

Access to run docker commands on a host is access to root on that host. This is by design, since the functionality to mount filesystems and isolate an application requires root capabilities on Linux. The security vulnerability here is any sysadmin who grants the ability to run docker commands to users they would not otherwise trust with root access on that host. Adding users to the docker group should therefore be done with care.
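Since membership in the docker group amounts to root access, it can help to audit who is in it. The sketch below assumes the group is named docker, as it is on a default installation; the helper group_members is a hypothetical name, and the entry passed to it here is made-up example data.

```shell
#!/bin/sh
# Print the members of a group, one per line, from a group(5)-style entry:
#   name:password:gid:member1,member2,...
# group_members is a hypothetical helper for illustration only.
group_members() {
    printf '%s\n' "$1" | cut -d: -f4 | tr ',' '\n' | sed '/^$/d'
}

# Example entry (made up). On a real host, pass the actual entry instead:
#   group_members "$(getent group docker)"
group_members "docker:x:999:alice,bob"
```

This lists only supplementary members from the fourth field of the entry. A user whose primary group is docker would not appear, so treat the output as a starting point for a review.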

I still see Docker as a security improvement when used correctly, since applications run inside a container are restricted in what they can do to the host. The ability to cause damage is granted through explicit options when running the container, such as mounting the root filesystem as a read-write volume, giving direct access to devices, or adding capabilities to root that permit escaping the namespace. Barring the explicit creation of those security holes, an application run inside a container has much less access than it would if it were run outside of the container.

If you still want to try locking down users who have access to docker, there are some additional security features. User namespacing is one of them; it prevents root inside the container from having root access on the host. There's also interlock, which allows you to limit the commands available per user.



Answer 3:

You're missing that containers run as uid 0 internally by default, so this is expected. If you want to restrict permissions further inside the container, build the image with a USER statement in the Dockerfile. The container process will then run as the named user at runtime instead of as root.

Note that the uid of this user is not necessarily predictable, as it is assigned inside the image you build, and it won't necessarily map to anything on the outside system. The point, however, is that it won't be root.
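To check whether an image was built with a USER statement, you can read the Config.User field that docker inspect reports; an empty value means the container starts as uid 0. The sketch below only interprets that string. The function describe_image_user is a hypothetical name, and debian in the usage comment is just an example image.

```shell
#!/bin/sh
# Interpret an image's Config.User value.
# Empty, "0", or "root" (optionally followed by :group) means the container
# process starts as root; any other value is the configured non-root user.
# describe_image_user is a hypothetical helper for illustration only.
describe_image_user() {
    user=${1%%:*}    # drop an optional ":group" suffix before checking
    if [ -z "$user" ] || [ "$user" = "0" ] || [ "$user" = "root" ]; then
        echo "root"
    else
        echo "$1"
    fi
}

# Usage (needs a running Docker daemon and a locally available image):
#   describe_image_user "$(docker inspect --format '{{.Config.User}}' debian)"
```

Keep in mind that a USER statement only sets the default: anyone who can run docker commands can still override it with --user, so this restricts the application inside the container rather than the person starting it.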

Refer to the Dockerfile reference for more information.



Tags: docker