I created a Rails app in a Docker environment, and it links to a Postgres instance. I edited the Postgres container to add initial data (by running rake db:setup from the Rails app). Then I committed the Postgres container, but the resulting image doesn't seem to remember my data when I create a new container from it.
Isn't it possible to save data in a commit and then reuse it afterwards?
I used the postgres image: https://registry.hub.docker.com/_/postgres/
The problem is that the postgres Dockerfile declares "/var/lib/postgresql/data" as a volume. This is just a normal directory that lives outside of the Union File System used by images, so docker commit does not capture it. A volume persists until no container references it and it is explicitly deleted.
You have a few choices:
- Use the --volumes-from option to share data with new containers. This will only work if there is only one running postgres container at a time, but it is the best solution.
- Write your own Dockerfile which creates the data before declaring the volume. This data will then be copied into the volume when the container is created.
- Write an entrypoint or cmd script which populates the database at run time.
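As a rough sketch of the first option, assuming the container that already holds the data still exists, a new container can mount its volumes as shown below. The container names are hypothetical, and the DOCKER variable is only a convenience added here so the command can be printed with DOCKER=echo instead of executed:

```shell
# Override with DOCKER=echo to print the command instead of running it.
DOCKER="${DOCKER:-docker}"

# Start a new postgres container that mounts the volumes of an existing one.
# $1: name of the container that owns the data volume
# $2: name for the new container
run_with_volumes_from() {
    source_container="$1"
    new_name="$2"
    $DOCKER run -d --name "$new_name" --volumes-from "$source_container" postgres
}

# Example: run_with_volumes_from pg-old pg-new
```

The old container should be stopped first, since two postgres servers must not write to the same data directory at once.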
All of these suggestions require you to use volumes to manage the data once the container is running. Alternatively, you could write your own Dockerfile and simply not declare a volume. You could then use docker commit to create a new image after adding data. This will probably work in the short term, but it is definitely not how you should work with containers: it isn't repeatable, and you will eventually run out of layers in the Union File System.
Have a look at the official Docker docs on managing data in containers for more info.
Create a new Dockerfile and change PGDATA:
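# Start from the official image, which declares /var/lib/postgresql/data as a volume.
FROM postgres:9.2.10
# Create a data directory outside the declared volume so it stays in the image layers.
RUN mkdir -p /var/lib/postgresql-static/data
ENV PGDATA /var/lib/postgresql-static/data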
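Because PGDATA now points outside the declared volume, the database files end up in the container's writable layer, which docker commit does capture. A minimal sketch of the workflow, assuming the Dockerfile is in the current directory (the image and container names are hypothetical, and DOCKER=echo just prints the commands):

```shell
# Override with DOCKER=echo to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Build the image from the Dockerfile in the current directory.
# $1: tag for the new image
build_static_image() {
    $DOCKER build -t "$1" .
}

# Snapshot a container (after data has been loaded into it) as a new image.
# $1: container name or ID, $2: tag for the resulting image
commit_with_data() {
    $DOCKER commit "$1" "$2"
}

# Example:
#   build_static_image my-postgres-static
#   ... run a container from it and load the data (e.g. rake db:setup) ...
#   commit_with_data pg-seed my-postgres-seeded
```

Stopping the container before committing is probably the safer choice, so that postgres has shut down cleanly before the snapshot is taken.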
It is not possible to save this data with a commit, because the data resides on a volume mount that is specific to that container and is not part of the image. Once you run docker rm <container ID>, that volume is left orphaned (or deleted, if you pass -v). You can, however, use data volumes to share and reuse data between containers; changes are made directly on the volume.
You can use docker run -v /host/path:/Container/path to mount the volume in the new container.
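For the postgres image specifically, the container path would be the data directory, /var/lib/postgresql/data. A minimal sketch, assuming a hypothetical host directory such as /srv/pgdata (set DOCKER=echo to print the command rather than run it):

```shell
# Override with DOCKER=echo to print the command instead of running it.
DOCKER="${DOCKER:-docker}"

# Run postgres with its data directory bind-mounted from the host.
# $1: host directory that holds the data, $2: container name
run_with_host_dir() {
    host_dir="$1"
    name="$2"
    $DOCKER run -d --name "$name" -v "$host_dir":/var/lib/postgresql/data postgres
}

# Example: run_with_host_dir /srv/pgdata pg-host
```

Any later container started with the same host directory sees the same database files, regardless of which image or container created them.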
Please refer to: https://docs.docker.com/userguide/dockervolumes/
For keeping permanent data such as databases, you should define the data volumes as external. That way they will not be removed or created automatically every time you run docker-compose up or down, or redeploy your stack to the swarm.
...
volumes:
  db-data:
    external: true
...
Then you should create this volume:
docker volume create db-data
and use it as the data volume for your database:
...
db:
  image: postgres:latest
  volumes:
    - db-data:/var/lib/postgresql/data
  ports:
    - 5432:5432
...
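Since Compose refuses to start when a volume declared as external does not exist, it can help to check for the volume before bringing the stack up. A small sketch of such a check; the function name ensure_volume is made up for illustration, and DOCKER is an overridable variable added here for dry runs:

```shell
# Override DOCKER to substitute another command (e.g. for a dry run).
DOCKER="${DOCKER:-docker}"

# Create the named volume only if it does not exist yet.
# $1: volume name
ensure_volume() {
    volume_name="$1"
    if ! $DOCKER volume inspect "$volume_name" >/dev/null 2>&1; then
        $DOCKER volume create "$volume_name"
    fi
}

# Example: ensure_volume db-data, then run docker-compose up -d
```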
In production, there are many factors to consider when using Docker to keep permanent data safe, especially in swarm mode or in a Kubernetes cluster.