I'd like to comprehensively understand the run-time performance cost of a Docker container. I've found references to networking anecdotally being ~100µs slower.
I've also found references to the run-time cost being "negligible" and "close to zero" but I'd like to know more precisely what those costs are. Ideally I'd like to know what Docker is abstracting with a performance cost and things that are abstracted without a performance cost. Networking, CPU, memory, etc.
Furthermore, if there are abstraction costs, are there ways to get around the abstraction cost. For example, perhaps I can mount a disk directly vs. virtually in Docker.
Here is an excellent 2014 IBM research paper titled "An Updated Performance Comparison of Virtual Machines and Linux Containers" by Felter et al. that provides a comparison between bare metal, KVM, and Docker containers. The general result is that Docker is nearly identical to Native performance and faster than KVM in every category.
The exception to this is Docker's NAT - if you use port mapping (e.g. docker run -p 8080:8080
) then you can expect a minor hit in latency, as shown below. However, you can now use the host network stack (e.g. docker run --net=host
) when launching a Docker container, which will perform identically to the Native
column (as shown in the Redis latency results lower down).
They also ran latency tests on a few specific services, such as Redis. You can see that above 20 client threads, highest latency overhead goes Docker NAT, then KVM, then a rough tie between Docker host/native.
Just because it's a really useful paper, here are some other figures. Please download it for full access.
Taking a look at Disk IO:
Now looking at CPU overhead:
Now some examples of memory (read the paper for details, memory can be extra tricky)
Docker isn't virtualization, as such -- instead, it's an abstraction on top of the kernel's support for different process namespaces, device namespaces, etc.; one namespace isn't inherently more expensive or inefficient than another, so what actually makes Docker have a performance impact is a matter of what's actually in those namespaces.
Docker's choices in terms of how it configures namespaces for its containers have costs, but those costs are all directly associated with benefits -- you can give them up, but in doing so you also give up the associated benefit:
- Layered filesystems are expensive -- exactly what the costs are vary with each one (and Docker supports multiple backends), and with your usage patterns (merging multiple large directories, or merging a very deep set of filesystems will be particularly expensive), but they're not free. On the other hand, a great deal of Docker's functionality -- being able to build guests off other guests in a copy-on-write manner, and getting the storage advantages implicit in same -- ride on paying this cost.
- DNAT gets expensive at scale -- but gives you the benefit of being able to configure your guest's networking independently of your host's and have a convenient interface for forwarding only the ports you want between them. You can replace this with a bridge to a physical interface, but again, lose the benefit.
- Being able to run each software stack with its dependencies installed in the most convenient manner -- independent of the host's distro, libc, and other library versions -- is a great benefit, but needing to load shared libraries more than once (when their versions differ) has the cost you'd expect.
And so forth. How much these costs actually impact you in your environment -- with your network access patterns, your memory constraints, etc -- is an item for which it's difficult to provide a generic answer.
Here's some more benchmarks for Docker based memcached server
versus host native memcached server
using Twemperf benchmark tool https://github.com/twitter/twemperf with 5000 connections and 20k connection rate
Connect time overhead for docker based memcached seems to agree with above whitepaper at roughly twice native speed.
Twemperf Docker Memcached
Connection rate: 9817.9 conn/s
Connection time [ms]: avg 341.1 min 73.7 max 396.2 stddev 52.11
Connect time [ms]: avg 55.0 min 1.1 max 103.1 stddev 28.14
Request rate: 83942.7 req/s (0.0 ms/req)
Request size [B]: avg 129.0 min 129.0 max 129.0 stddev 0.00
Response rate: 83942.7 rsp/s (0.0 ms/rsp)
Response size [B]: avg 8.0 min 8.0 max 8.0 stddev 0.00
Response time [ms]: avg 28.6 min 1.2 max 65.0 stddev 0.01
Response time [ms]: p25 24.0 p50 27.0 p75 29.0
Response time [ms]: p95 58.0 p99 62.0 p999 65.0
Twemperf Centmin Mod Memcached
Connection rate: 11419.3 conn/s
Connection time [ms]: avg 200.5 min 0.6 max 263.2 stddev 73.85
Connect time [ms]: avg 26.2 min 0.0 max 53.5 stddev 14.59
Request rate: 114192.6 req/s (0.0 ms/req)
Request size [B]: avg 129.0 min 129.0 max 129.0 stddev 0.00
Response rate: 114192.6 rsp/s (0.0 ms/rsp)
Response size [B]: avg 8.0 min 8.0 max 8.0 stddev 0.00
Response time [ms]: avg 17.4 min 0.0 max 28.8 stddev 0.01
Response time [ms]: p25 12.0 p50 20.0 p75 23.0
Response time [ms]: p95 28.0 p99 28.0 p999 29.0
Here's bencmarks using memtier benchmark tool
memtier_benchmark docker Memcached
4 Threads
50 Connections per thread
10000 Requests per thread
Type Ops/sec Hits/sec Misses/sec Latency KB/sec
------------------------------------------------------------------------
Sets 16821.99 --- --- 1.12600 2271.79
Gets 168035.07 159636.00 8399.07 1.12000 23884.00
Totals 184857.06 159636.00 8399.07 1.12100 26155.79
memtier_benchmark Centmin Mod Memcached
4 Threads
50 Connections per thread
10000 Requests per thread
Type Ops/sec Hits/sec Misses/sec Latency KB/sec
------------------------------------------------------------------------
Sets 28468.13 --- --- 0.62300 3844.59
Gets 284368.51 266547.14 17821.36 0.62200 39964.31
Totals 312836.64 266547.14 17821.36 0.62200 43808.90