I have a problem where my containers have become too heavy and many of them share a lot of the same dependencies.
I would like to make a base container that would install and hold all the dependencies, and then have the other containers point to the dependencies dir on that base container (using volumes).
I'm trying to do a small POC of this. I started by having one container install a Python package, and another container run a Python script that uses that module.
I'm thinking that I will make a directory in the host that will be mounted on all containers, and will contain all the data and dependencies that are needed.
I should note that I can't use Docker Compose, even though that would probably be better suited to this.
This is the Dockerfile for my base container:
FROM python:3.6-slim
RUN apt-get update && apt-get install -y vim
RUN pip install --install-option="--prefix=/volumes/shared/dependencies" deepdiff
CMD tail -f /dev/null
You can see that pip will install to the /volumes/shared/dependencies dir.
I run it like this:
docker build -t base_container .
docker run -ti -v "$PWD/shared/base_dependencies":/volumes/shared/dependencies base_container
Now if I go into the container to /volumes/shared/dependencies, I see the files I put in the host dir, but not the installed package. On the other hand, if the host dir is empty, I see the installed package.
I also tried applying two volumes (one for the files going in, and one for the files that the container will create).
How can I get a two-way volume in this situation? An explanation of why this is happening would also be nice.
When you do docker run with a bind-mounted volume, Docker first creates the directory on the host machine if it does not exist, and then mounts it over the target path in the container. The mount hides whatever the image had at that path, so inside the container you see the host directory's contents (possibly nothing) instead of the installed package.
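You can see the effect directly; a sketch of the commands to reproduce it (where Docker is available), using the image built from the question's Dockerfile:

```shell
mkdir -p shared/base_dependencies          # empty host dir
docker run --rm \
    -v "$PWD/shared/base_dependencies":/volumes/shared/dependencies \
    base_container ls /volumes/shared/dependencies
# the listing shows the host dir's contents (nothing here), not the
# package files that were baked into the image at that path
```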
Just copy the dependencies into the shared volume at "runtime" instead; then you also don't need to keep the container alive with tail -f:
FROM python:3.6-slim
RUN apt-get update && apt-get install -y vim
RUN pip install --install-option="--prefix=/temp" deepdiff
CMD cp -pr /temp/. /volumes/shared/dependencies
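On the consumer side, a container that mounts the same host directory still has to put the copied prefix on Python's import path before the module can be imported. A minimal sketch, assuming the copied pip --prefix tree ends up directly under /volumes/shared/dependencies (adjust the path to wherever the files actually land on your setup):

```python
import os
import sys

# Assumed mount point shared with the base container (hypothetical path;
# adjust to where the copied pip --prefix tree actually lands).
DEPS_PREFIX = "/volumes/shared/dependencies"

def add_deps_to_path(prefix, py_version="3.6"):
    """Prepend the site-packages dir of a pip --prefix layout to sys.path."""
    site = os.path.join(prefix, "lib", "python" + py_version, "site-packages")
    if site not in sys.path:
        sys.path.insert(0, site)
    return site

add_deps_to_path(DEPS_PREFIX)
# after this, `import deepdiff` can resolve from the shared volume
```

Setting PYTHONPATH to the same site-packages directory in the consumer's Dockerfile achieves the same thing without touching the script.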
One productive approach you can take is to build a common base image that contains your dependencies, and then build applications on top of that. I'll show this using multi-stage Dockerfile syntax, but you can do something similar with a totally separate base image.
FROM python:3 AS base
RUN pip3 install numpy \
&& pip3 install scipy
FROM base
WORKDIR /app
COPY requirements.txt ./
RUN pip3 install -r requirements.txt
COPY . ./
CMD ["./myscript.py"]
If you had multiple applications that all needed the same large base libraries, they could all be built FROM the same base image, and they would share the layers in that image. (This depends a little on your repository setup.) If you then updated the base image, you'd have to rebuild the applications on top of it, but at the same time, things that haven't been updated are protected from surprise changes underneath them.
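The separate-base-image variant looks like this — a sketch using hypothetical image names (my-deps-base, my-app):

```dockerfile
# Dockerfile.base — build once:
#   docker build -t my-deps-base -f Dockerfile.base .
FROM python:3
RUN pip3 install numpy scipy

# Dockerfile for each application — docker build -t my-app .
FROM my-deps-base
WORKDIR /app
COPY requirements.txt ./
RUN pip3 install -r requirements.txt
COPY . ./
CMD ["./myscript.py"]
```

Every application image built FROM my-deps-base reuses its layers, so the heavy libraries are stored and pulled only once.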
Do not share code via volumes. Especially, do not make an image's ability to run at all depend on an external resource the image doesn't control; that defeats the point of Docker's isolation.
As a minimal example of what goes wrong with a volume-based approach:
docker run -d -v "$PWD/site-packages":/usr/local/lib/python3/site-packages ...
rm -rf site-packages/some-lib
# what now?