How to avoid reinstalling packages when building D

2019-01-21 03:28发布

问题:

My Dockerfile is something like

FROM my/base

ADD . /srv
RUN pip install -r requirements.txt
RUN python setup.py install

ENTRYPOINT ["run_server"]

Every time I build a new image, dependencies have to be reinstalled, which could be very slow in my region.

One way I think of to cache packages that have been installed is to override the my/base image with newer images like this:

docker build -t new_image_1 .
docker tag new_image_1 my/base

So next time I build with this Dockerfile, my/base already has some packages installed.

But this solution has two problems:

  1. It is not always possible to override a base image
  2. The base image grow bigger and bigger as newer images are layered on it

So what better solution could I use to solve this problem?

EDIT##:

Some information about the docker on my machine:

☁  test  docker version
Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.1
Git commit (client): d84a070
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.1
Git commit (server): d84a070
☁  test  docker info
Containers: 0
Images: 56
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Dirs: 56
Execution Driver: native-0.2
Kernel Version: 3.13.0-29-generic
WARNING: No swap limit support

回答1:

Try to build with below Dockerfile.

FROM my/base

WORKDIR /srv
ADD ./requirements.txt /srv/requirements.txt
RUN pip install -r requirements.txt
ADD . /srv
RUN python setup.py install

ENTRYPOINT ["run_server"]

If there are some changes on .(your project), docker skip pip install line by using cache.

Docker only run pip install on build when you edit requirements.txt file.


I write simple Hello, World! program.

$ tree
.
├── Dockerfile
├── requirements.txt
└── run.py   

0 directories, 3 file

# Dockerfile

FROM dockerfile/python
WORKDIR /srv
ADD ./requirements.txt /srv/requirements.txt
RUN pip install -r requirements.txt
ADD . /srv
CMD python /srv/run.py

# requirements.txt
pytest==2.3.4

# run.py
print("Hello, World")

Below is output.

Step 1 : WORKDIR /srv
---> Running in 22d725d22e10
---> 55768a00fd94
Removing intermediate container 22d725d22e10
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> 968a7c3a4483
Removing intermediate container 5f4e01f290fd
Step 3 : RUN pip install -r requirements.txt
---> Running in 08188205e92b
Downloading/unpacking pytest==2.3.4 (from -r requirements.txt (line 1))
  Running setup.py (path:/tmp/pip_build_root/pytest/setup.py) egg_info for package pytest
....
Cleaning up...
---> bf5c154b87c9
Removing intermediate container 08188205e92b
Step 4 : ADD . /srv
---> 3002a3a67e72
Removing intermediate container 83defd1851d0
Step 5 : CMD python /srv/run.py
---> Running in 11e69b887341
---> 5c0e7e3726d6
Removing intermediate container 11e69b887341
Successfully built 5c0e7e3726d6

I update only run.py and try to build again.

# run.py
print("Hello, Python")

Below is output.

Sending build context to Docker daemon  5.12 kB
Sending build context to Docker daemon 
Step 0 : FROM dockerfile/python
---> f86d6993fc7b
Step 1 : WORKDIR /srv
---> Using cache
---> 55768a00fd94
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> Using cache
---> 968a7c3a4483
Step 3 : RUN pip install -r requirements.txt
---> Using cache
---> bf5c154b87c9
Step 4 : ADD . /srv
---> 9cc7508034d6
Removing intermediate container 0d7cf71eb05e
Step 5 : CMD python /srv/run.py
---> Running in f25c21135010
---> 4ffab7bc66c7
Removing intermediate container f25c21135010
Successfully built 4ffab7bc66c7

As you can see above, docker use build cache. And I update requirements.txt this time.

# requirements.txt

pytest==2.3.4
ipython

Below is output.

Sending build context to Docker daemon  5.12 kB
Sending build context to Docker daemon 
Step 0 : FROM dockerfile/python
---> f86d6993fc7b
Step 1 : WORKDIR /srv
---> Using cache
---> 55768a00fd94
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> b6c19f0643b5
Removing intermediate container a4d9cb37dff0
Step 3 : RUN pip install -r requirements.txt
---> Running in 4b7a85a64c33
Downloading/unpacking pytest==2.3.4 (from -r requirements.txt (line 1))
  Running setup.py (path:/tmp/pip_build_root/pytest/setup.py) egg_info for package pytest

Downloading/unpacking ipython (from -r requirements.txt (line 2))
Downloading/unpacking py>=1.4.12 (from pytest==2.3.4->-r requirements.txt (line 1))
  Running setup.py (path:/tmp/pip_build_root/py/setup.py) egg_info for package py

Installing collected packages: pytest, ipython, py
  Running setup.py install for pytest

Installing py.test script to /usr/local/bin
Installing py.test-2.7 script to /usr/local/bin
  Running setup.py install for py

Successfully installed pytest ipython py
Cleaning up...
---> 23a1af3df8ed
Removing intermediate container 4b7a85a64c33
Step 4 : ADD . /srv
---> d8ae270eca35
Removing intermediate container 7f003ebc3179
Step 5 : CMD python /srv/run.py
---> Running in 510359cf9e12
---> e42fc9121a77
Removing intermediate container 510359cf9e12
Successfully built e42fc9121a77

And docker doesn't use build cache. If it doesn't work, check your docker version.

Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.1
Git commit (client): d84a070
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.1
Git commit (server): d84a070


回答2:

To minimise the network activity, you could point pip to a cache directory on your host machine.

Run your docker container with your host's pip cache directory bind mounted into your container's pip cache directory. docker run command should look like this:

docker run -v $HOME/.cache/pip/:/root/.cache/pip image_1

Then in your Dockerfile install your requirements as a part of ENTRYPOINT statement (or CMD statement) instead of as a RUN command. This is important, because (as pointed out in comments) the mount is not available during image building (when RUN statements are executed). Docker file should look like this:

FROM my/base

ADD . /srv

ENTRYPOINT ["sh", "-c", "pip install -r requirements.txt && python setup.py install && run_server"]

Probably it's best if the host system's default pip directory will be used as a cache (e.g. $HOME/.cache/pip/ on Linux or $HOME/Library/Caches/pip/ on OSX), just like I suggested in the example docker run command.



回答3:

I found that a better way is to just add the Python site-packages directory as a volume.

services:
    web:
        build: .
        command: python manage.py runserver 0.0.0.0:8000
        volumes:
            - .:/code
            -  /usr/local/lib/python2.7/site-packages/

This way I can just pip install new libraries without having to do a full rebuild.

EDIT: Disregard this answer, jkukul's answer above worked for me. My intent was to cache the site-packages folder. That would have looked something more like:

volumes:
   - .:/code
   - ./cached-packages:/usr/local/lib/python2.7/site-packages/

Caching the download folder is alot cleaner though. That also caches the wheels, so it properly achieves the task.



标签: python docker