I want to test my app with Docker. So, I have this in Dockerfile:
FROM python:3-onbuild
CMD [ "python", "./test.py" ]
test.py:
print(123)
Then I run:
docker build -t my_test_app .
So, I have one big image. docker images returns:
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
python 3-onbuild b258eb0a5195 8 days ago 757 MB
Why is the file size so large?
Is that file size normal?
I just checked on my machine: the standard ubuntu:trusty image is 188 MB and the image with all the Python stuff is 480 MB. I see 800 MB images quite often; those are usually ones that contain some meaningful application.
However, these images are based on our private images; the official Docker library image seems much larger for some reason. They are aware of this fact and are trying to reduce it. Look at the discussion on the subject here
If you need a somewhat smaller image, try 'rouge8/pythons'; it is about 100 MB smaller.
rouge8/pythons latest … 680.3 MB
Keep in mind that Docker images are arranged as a hierarchy of layers. So if you reuse the same underlying base image for many containers, the size that each individual container adds is quite small: only the difference between the base and whatever you added to that specific container.
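As a rough illustration of that layer arithmetic, the sketch below compares total disk usage with and without a shared base. The sizes are made-up round numbers, not measurements, and disk_usage_mb is a hypothetical helper written only for this example.

```python
def disk_usage_mb(base_mb, app_layers_mb, shared=True):
    """Estimate total disk usage for images built on the same base.

    base_mb: size of the common base image (illustrative figure).
    app_layers_mb: size each application adds on top of the base.
    shared: True if the base layers are stored once and reused.
    """
    if shared:
        # The base is stored once; each image only adds its own layers.
        return base_mb + sum(app_layers_mb)
    # Without sharing, every image would carry its own copy of the base.
    return sum(base_mb + extra for extra in app_layers_mb)


if __name__ == "__main__":
    apps = [5, 12, 8]  # hypothetical per-application layer sizes in MB
    print(disk_usage_mb(750, apps, shared=True))
    print(disk_usage_mb(750, apps, shared=False))
```

With a shared base, the cost of each extra image is only its own layers, which is why a large base image matters less once several images build on it.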
You can try the python:{version}-alpine version. It's much smaller:
>> docker image ls |grep python
python 3.6-alpine 89.4 MB
python 3.6 689 MB
python 3.5 689 MB
python 3.5.2 687 MB
python 3.4 833 MB
python 2.7 676 MB
At the time of writing, it looks like the official image supports -alpine on all Python versions.
https://hub.docker.com/_/python/
Alpine Linux is a very lean distro available for Docker. Without Python, it's around 5 MB. With Python, I'm getting images between 60 and 120 MB. The following Dockerfile yields a 110 MB image.
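# Start from the small Alpine base image.
FROM alpine:3.4
# Install build tools and packages, pip-install the rest, then remove the
# build-only packages and the apk cache in the same layer to keep it small.
# Finally, create an unprivileged user.
RUN apk --update add \
        build-base python-dev \
        ca-certificates python \
        ttf-droid \
        py-pip \
        py-jinja2 \
        py-twisted \
        py-dateutil \
        py-tz \
        py-requests \
        py-pillow \
        py-rrd && \
    pip install --upgrade arrow \
        pymongo \
        websocket-client \
        XlsxWriter && \
    apk del build-base python-dev && \
    rm -rf /var/cache/apk/* && \
    adduser -D -u 1001 noroot
# Run as the unprivileged user rather than root.
USER noroot
CMD ["/bin/sh"]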
Also, it's very well maintained.
A word of warning, though. Alpine uses musl libc instead of glibc, and some Python modules rely on glibc, but this usually isn't a problem.
A bigger issue is that, because of this, manylinux wheels are not available for Alpine, and therefore the modules need to be compiled upon installation (pip install). In some cases this can make a difference in build time between 20 seconds on Debian and 9 minutes or more on Alpine. The grpcio module is notorious for that; it takes forever to compile.
There is a (somewhat unreliable) workaround where you tell Python that it is manylinux compatible.
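The answer does not say which workaround it means. One possibility, assumed here, is the override hook described in PEP 513: pip looks for a module named _manylinux and reads its manylinux1_compatible attribute. The sketch below writes such a module; write_manylinux_override is a hypothetical helper, and later wheel tags (manylinux2010, manylinux2014) use their own attribute names.

```python
import os
import tempfile


def write_manylinux_override(site_packages_dir):
    """Write a _manylinux.py that declares manylinux1 compatibility.

    PEP 513 lets a platform override pip's detection through a module
    named _manylinux with a manylinux1_compatible attribute.
    """
    path = os.path.join(site_packages_dir, "_manylinux.py")
    with open(path, "w") as handle:
        # Assumption: forcing this to True on a musl system is exactly the
        # unreliable part; wheels built against glibc may fail at import.
        handle.write("manylinux1_compatible = True\n")
    return path


if __name__ == "__main__":
    # Demonstrate against a throwaway directory rather than the real
    # site-packages, so running this sketch changes nothing on the system.
    with tempfile.TemporaryDirectory() as demo_dir:
        print(write_manylinux_override(demo_dir))
```

In a real image the file would have to live somewhere on Python's import path. Because the wheels were built against glibc, any module installed this way should be import-tested before relying on it.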
The official images add various system packages for things like database clients, image file manipulation and XML parsing libraries. This is so there is no extra work a user has to do if they want to install Python packages such as psycopg2, MySQLdb, Pillow or lxml. Adding those extra packages, though, means that the image will be fatter, which is a waste of space if you don't need those packages.
They also don't attempt to trim stuff out of the Python installation which isn't really required, such as all the standard library test code directories. Even the .pyc files can be trimmed to save on space without any real impact as a web application generally loads up once for the life of the container, so having .pyc files doesn't really benefit you much.
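To get a feel for how much space such trimming might recover, here is a minimal sketch that adds up the size of .pyc files and of everything under directories named test or tests. reclaimable_bytes is a hypothetical helper, and the directory-name rule is an assumption rather than how any particular image is trimmed.

```python
import os
import sysconfig


def reclaimable_bytes(root):
    """Sum the size of .pyc files and of everything under test directories."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # Assumption: a directory named "test" or "tests" anywhere in the
        # path holds test code that the application does not need at runtime.
        parts = os.path.relpath(dirpath, root).split(os.sep)
        in_test_dir = any(part in ("test", "tests") for part in parts)
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Skip broken symlinks and other non-regular entries.
            if not os.path.isfile(full_path):
                continue
            if in_test_dir or name.endswith(".pyc"):
                total += os.path.getsize(full_path)
    return total


if __name__ == "__main__":
    # Report the figure for the standard library of the running interpreter.
    stdlib_dir = sysconfig.get_paths()["stdlib"]
    print(stdlib_dir, reclaimable_bytes(stdlib_dir))
```

The function only measures and deletes nothing, so it is safe to run against an existing installation before deciding what to strip.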
As a comparison, have a look at the 'python:X.Y-slim' variants and their sizes. There isn't, though, an onbuild variant for the slim images.
You could also look at my own Docker images for Python with bundled Apache/mod_wsgi support. These are trimmed and rely on additional packages being installed by the user as build hooks only if required. For those, the size of the Python 3.4 onbuild image specifically for a WSGI application is:
grahamdumpleton/mod-wsgi-docker python-3.4-onbuild ... 409.9 MB
The size given even includes Apache and mod_wsgi, giving you a proper production grade WSGI server with capabilities to handle static file content and much more.
If not running a WSGI application, start with the base image instead.
You can find the mod_wsgi docker images at:
- https://registry.hub.docker.com/u/grahamdumpleton/mod-wsgi-docker/
Various blog posts about how to use these images for WSGI applications and constructing Docker images for Python and WSGI applications can be found linked from the image description on Docker hub. Also keep an eye on my blog site in general as I will be posting more about Docker and Python as time goes by.
Yes, that's a normal size. The image contains an operating system image and various packages, which is why it is so large.