Big size of python image in Docker

Posted 2020-08-09 07:26

Question:

I want to test my app with Docker. So, I have this in Dockerfile:

FROM python:3-onbuild
CMD [ "python", "./test.py" ]

test.py:

print(123)

Then I run:

docker build -t my_test_app .

So, I end up with one big image. docker images returns:

REPOSITORY          TAG                 IMAGE ID        CREATED    VIRTUAL SIZE
python              3-onbuild           b258eb0a5195    8 days ago 757 MB

Why is the file size so large?

Is that file size normal?

Answer 1:

I just checked on my machine: the standard ubuntu:trusty image is 188 MB, and the image with all the Python stuff is 480 MB. I see 800 MB images quite often; those are usually ones that contain some meaningful application.

However, these images are based on our private images. The official Docker library image seems much larger for some reason. The maintainers are aware of this and are trying to reduce it. Look at the discussion on the subject here.

If you need a slightly smaller image, try 'rouge8/pythons'. It is about 100 MB smaller:

rouge8/pythons latest … 680.3 MB

Keep in mind that Docker images are arranged as a hierarchy of layers. So if you reuse the same underlying base image for many containers, the size that each individual container adds is quite small: only the difference between the base and whatever you added to that specific container.
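To make the layer arithmetic concrete, here is a minimal sketch in Python. The image names, layer names, and the small application layer sizes are hypothetical; only the 757 MB base size comes from the question. It is not how Docker itself accounts for storage, just the idea that a shared layer is stored once.

```python
def total_disk_usage(images):
    """Return the total size in MB, counting each distinct layer only once.

    `images` maps an image name to a list of (layer_id, size_mb) pairs.
    """
    unique_layers = {}
    for layers in images.values():
        for layer_id, size_mb in layers:
            unique_layers[layer_id] = size_mb
    return sum(unique_layers.values())


# Two hypothetical applications built on the same 757 MB base image.
images = {
    "app_a": [("base", 757), ("app_a_code", 2)],
    "app_b": [("base", 757), ("app_b_code", 5)],
}

# Sum of the sizes listed per image, as if nothing were shared.
naive_total = sum(size for layers in images.values() for _, size in layers)

print(naive_total)               # 1521
print(total_disk_usage(images))  # 764
```

So the second application costs only its own 5 MB layer, even though its image is listed at roughly the full size.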



Answer 2:

You can try the python:{version}-alpine version. It's much smaller:

>> docker image ls |grep python
python    3.6-alpine     89.4 MB
python    3.6            689 MB
python    3.5            689 MB
python    3.5.2          687 MB
python    3.4            833 MB
python    2.7            676 MB

At the time of writing, it looks like the official image supports -alpine on all Python versions.

https://hub.docker.com/_/python/



Answer 3:

Alpine Linux is a very lean distro available for Docker. Without Python, it's around 5 MB. With Python, I'm getting images between 60 and 120 MB. The following Dockerfile yields a 110 MB image.

FROM alpine:3.4

RUN apk --update add \
      build-base python-dev \
      ca-certificates python \
      ttf-droid \
      py-pip \
      py-jinja2 \
      py-twisted \
      py-dateutil \
      py-tz \
      py-requests \
      py-pillow \
      py-rrd && \
    pip install --upgrade arrow \
                          pymongo \
                          websocket-client \
                          XlsxWriter && \
    apk del build-base python-dev && \
    rm -rf /var/cache/apk/* && \
    adduser -D -u 1001 noroot

USER noroot

CMD ["/bin/sh"]

Also, it's very well maintained.


A word of warning, though. Alpine uses musl libc instead of glibc, and some Python modules rely on glibc, but this usually isn't a problem.

A bigger issue is that, because of this, manylinux wheels are not available for Alpine, so the modules need to be compiled on installation (pip install). In some cases this can make a difference in build time between 20 seconds on Debian and 9 minutes or more on Alpine. The grpcio module is notorious for that; it takes forever to compile.

There is a (somewhat unreliable) workaround where you tell Python that it is manylinux compatible.
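If you are not sure which libc a given image uses, a rough check from inside the container is possible with the standard library. This is only a sketch: the helper name is my own, and platform.libc_ver() is a heuristic that reports 'glibc' on glibc systems and typically an empty string elsewhere (including musl), so a False result means "not detected as glibc" rather than "definitely musl".

```python
import platform


def detected_glibc():
    """Return True if the running interpreter appears to be linked against glibc."""
    libc_name, _libc_version = platform.libc_ver()
    return libc_name == "glibc"


if detected_glibc():
    print("glibc detected: manylinux wheels should be usable")
else:
    print("glibc not detected: pip may need to compile packages from source")
```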



Answer 4:

The official images add various system packages for things like database clients, image file manipulation and XML parsing libraries. This is so there is no extra work a user has to do if they want to install Python packages such as psycopg2, MySQLdb, Pillow or lxml. Adding those extra packages, though, means that the image will be fatter, which is a waste of space if you don't need them.

They also don't attempt to trim stuff out of the Python installation which isn't really required, such as all the standard library test code directories. Even the .pyc files can be trimmed to save on space without any real impact as a web application generally loads up once for the life of the container, so having .pyc files doesn't really benefit you much.
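To get a feel for how much space that kind of trimming could save, here is a small sketch that only measures and deletes nothing. The function name and the choice of 'test'/'tests' as the directory names to look for are my own assumptions, not what any particular image build does.

```python
import os
import sysconfig


def reclaimable_bytes(root):
    """Sum the sizes of .pyc files and of all files under 'test'/'tests' directories."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # Path components of this directory relative to the starting point.
        parts = os.path.relpath(dirpath, root).split(os.sep)
        in_test_dir = any(part in ("test", "tests") for part in parts)
        for name in filenames:
            if in_test_dir or name.endswith(".pyc"):
                path = os.path.join(dirpath, name)
                if not os.path.islink(path):
                    total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    # Measure the standard library of the interpreter running this script.
    stdlib_dir = sysconfig.get_paths()["stdlib"]
    size_mb = reclaimable_bytes(stdlib_dir) / (1024 * 1024)
    print(f"{stdlib_dir}: about {size_mb:.1f} MB in .pyc files and test directories")
```

Run it inside the container you want to inspect; the result depends on the Python version and on whether the .pyc files have been generated yet.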

As a comparison, have a look at the 'python:X.Y-slim' variants and their sizes. There is no onbuild variant for the slim images, though.

You could also look at my own Docker images for Python with bundled Apache/mod_wsgi support. These are trimmed and rely on additional packages being installed by the user as build hooks only if required. For those, the size of the Python 3.4 onbuild image specifically for a WSGI application is:

grahamdumpleton/mod-wsgi-docker python-3.4-onbuild ... 409.9 MB

The size given even includes Apache and mod_wsgi, giving you a proper production grade WSGI server with capabilities to handle static file content and much more.

If not running a WSGI application, start with the base image instead.

You can find the mod_wsgi docker images at:

  • https://registry.hub.docker.com/u/grahamdumpleton/mod-wsgi-docker/

Various blog posts about how to use these images for WSGI applications and constructing Docker images for Python and WSGI applications can be found linked from the image description on Docker hub. Also keep an eye on my blog site in general as I will be posting more about Docker and Python as time goes by.



Answer 5:

Yes, that is a normal size. The image contains an operating system image and various packages, which is why it is so large.



Tags: python docker