Sometimes there is a need to use sensitive data when building a Docker image, for example an API token or SSH key needed to download a remote file or to install dependencies from a private repository. It may be desirable to distribute the resulting image while leaving out the sensitive credentials that were used to build it. How can this be done?
I have seen docker-squash, which can squash multiple layers into one, removing any deleted files from the final image. But is there a more idiomatic approach?
Sadly, there is still no proper solution for handling sensitive data while building a Docker image.
This bug has a good summary of what is wrong with every hack that people suggest: https://github.com/moby/moby/issues/13490
And most advice seems to confuse secrets that need to go INTO the container with secrets that are used to build the container, like several of the answers here.
The current solutions that actually seem to be secure all center on writing the secret out to disk or memory, starting a small HTTP server, and having the build process pull the secret from that server, use it, and not store it in the image.
The best I've found without going to that level of complexity is to (mis)use the built-in predefined-args feature of Docker Compose files, as specified in this comment:
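To make the HTTP server idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not what any particular tool does: the names serve_secret_once and fetch_secret are hypothetical, and the server simply hands the secret to the first GET request and then shuts itself down.

```python
import http.server
import threading
import urllib.request


def serve_secret_once(secret: bytes, host: str = "127.0.0.1", port: int = 0):
    """Hypothetical helper: serve `secret` to the first GET request, then stop.

    Returns the server (so the caller can read the chosen port) and the
    background thread that runs it.
    """
    state = {"served": False}

    class Handler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            # Refuse any request after the first one.
            if state["served"]:
                self.send_error(410, "secret already served")
                return
            state["served"] = True
            self.send_response(200)
            self.send_header("Content-Type", "application/octet-stream")
            self.send_header("Content-Length", str(len(secret)))
            self.end_headers()
            self.wfile.write(secret)
            # shutdown() waits for serve_forever() to exit, so it must be
            # called from a thread other than the one handling this request.
            threading.Thread(target=self.server.shutdown, daemon=True).start()

        def log_message(self, format, *args):
            # Keep the request out of the console log.
            pass

    server = http.server.HTTPServer((host, port), Handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, thread


def fetch_secret(url: str) -> bytes:
    """Client side (what the build step would do): fetch the secret into memory."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.read()
```

In a real build, the server would have to listen on an address the build can reach, and the build step would fetch the secret, use it, and discard it within the same instruction so that nothing is written into a layer.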
https://github.com/moby/moby/issues/13490#issuecomment-403612834
That does seem to keep the secrets out of the image build history.
As for an idiomatic approach, I'm not sure; Docker is still too young to have many established idioms.
We have had this same issue at our company, however. We have come to the following conclusions, although these are our best efforts rather than established docker best practices.
1) If you need the values at build time: Supply a properties file in the build context containing the values to be read during the build; the properties file can be deleted once the build finishes. This isn't as portable, but it will do the job.
2) If you need the values at run time: Pass values as environment variables. They will be visible to someone who has access to ps on the box, but this can be restricted via SELinux or other methods (honestly, I don't know this process, I'm a developer and the operations teams will deal with that part).
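For the run-time case, the application inside the container just reads the value from its environment. A minimal sketch, assuming a hypothetical helper named require_env and a hypothetical variable named API_TOKEN:

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or fail fast if unset or empty."""
    value = os.environ.get(name)
    if not value:
        # Name the missing variable, but never print a secret's value.
        raise RuntimeError(f"missing required environment variable: {name}")
    return value
```

The application would call require_env("API_TOKEN") at startup, so a container started without the variable fails immediately rather than partway through its work.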
The way we solve this issue is that we have a tool written on top of docker build. Once you initiate a build using the tool, it downloads a Dockerfile and alters it. It changes all instructions which require "the secret" to something like:
However, this leaves the secret data available to anyone with access to the image unless the layer itself is removed with a tool like docker-squash. The command used to generate each intermediate layer can be found using the docker history command.
Matthew Close talks about this in this blog article.
Summarized: You should use docker-compose to mount sensitive information into the container.