How is Docker different from a virtual machine? [c

2018-12-31 04:45发布

I keep rereading the Docker documentation to try to understand the difference between Docker and a full VM. How does it manage to provide a full filesystem, isolated networking environment, etc. without being as heavy?

Why is deploying software to a Docker image (if that's the right term) easier than simply deploying to a consistent production environment?

19条回答
牵手、夕阳
2楼-- · 2018-12-31 05:24

Docker originally used LinuX Containers (LXC), but later switched to runC (formerly known as libcontainer), which runs in the same operating system as its host. This allows it to share a lot of the host operating system resources. Also, it uses a layered filesystem (AuFS) and manages networking.

AuFS is a layered file system, so you can have a read only part and a write part which are merged together. One could have the common parts of the operating system as read only (and shared amongst all of your containers) and then give each container its own mount for writing.

So, let's say you have a 1 GB container image; if you wanted to use a full VM, you would need to have 1 GB times x number of VMs you want. With Docker and AuFS you can share the bulk of the 1 GB between all the containers and if you have 1000 containers you still might only have a little over 1 GB of space for the containers OS (assuming they are all running the same OS image).

A full virtualized system gets its own set of resources allocated to it, and does minimal sharing. You get more isolation, but it is much heavier (requires more resources). With Docker you get less isolation, but the containers are lightweight (require fewer resources). So you could easily run thousands of containers on a host, and it won't even blink. Try doing that with Xen, and unless you have a really big host, I don't think it is possible.

A full virtualized system usually takes minutes to start, whereas Docker/LXC/runC containers take seconds, and often even less than a second.

There are pros and cons for each type of virtualized system. If you want full isolation with guaranteed resources, a full VM is the way to go. If you just want to isolate processes from each other and want to run a ton of them on a reasonably sized host, then Docker/LXC/runC seems to be the way to go.

For more information, check out this set of blog posts which do a good job of explaining how LXC works.

Why is deploying software to a docker image (if that's the right term) easier than simply deploying to a consistent production environment?

Deploying a consistent production environment is easier said than done. Even if you use tools like Chef and Puppet, there are always OS updates and other things that change between hosts and environments.

Docker gives you the ability to snapshot the OS into a shared image, and makes it easy to deploy on other Docker hosts. Locally, dev, qa, prod, etc.: all the same image. Sure you can do this with other tools, but not nearly as easily or fast.

This is great for testing; let's say you have thousands of tests that need to connect to a database, and each test needs a pristine copy of the database and will make changes to the data. The classic approach to this is to reset the database after every test either with custom code or with tools like Flyway - this can be very time-consuming and means that tests must be run serially. However, with Docker you could create an image of your database and run up one instance per test, and then run all the tests in parallel since you know they will all be running against the same snapshot of the database. Since the tests are running in parallel and in Docker containers they could run all on the same box at the same time and should finish much faster. Try doing that with a full VM.

From comments...

Interesting! I suppose I'm still confused by the notion of "snapshot[ting] the OS". How does one do that without, well, making an image of the OS?

Well, let's see if I can explain. You start with a base image, and then make your changes, and commit those changes using docker, and it creates an image. This image contains only the differences from the base. When you want to run your image, you also need the base, and it layers your image on top of the base using a layered file system: as mentioned above, Docker uses AUFS. AUFS merges the different layers together and you get what you want; you just need to run it. You can keep adding more and more images (layers) and it will continue to only save the diffs. Since Docker typically builds on top of ready-made images from a registry, you rarely have to "snapshot" the whole OS yourself.

查看更多
素衣白纱
3楼-- · 2018-12-31 05:24

With a virtual machine, we have a server, we have a host operating system on that server, and then we have a hypervisor. And then running on top of that hypervisor, we have any number of guest operating systems with an application and its dependent binaries, and libraries on that server. It brings a whole guest operating system with it. It's quite heavyweight. Also there's a limit to how much you can actually put on each physical machine.

Enter image description here

Docker containers on the other hand, are slightly different. We have the server. We have the host operating system. But instead a hypervisor, we have the Docker engine, in this case. In this case, we're not bringing a whole guest operating system with us. We're bringing a very thin layer of the operating system, and the container can talk down into the host OS in order to get to the kernel functionality there. And that allows us to have a very lightweight container.

All it has in there is the application code and any binaries and libraries that it requires. And those binaries and libraries can actually be shared across different containers if you want them to be as well. And what this enables us to do, is a number of things. They have much faster startup time. You can't stand up a single VM in a few seconds like that. And equally, taking them down as quickly.. so we can scale up and down very quickly and we'll look at that later on.

Enter image description here

Every container thinks that it’s running on its own copy of the operating system. It’s got its own file system, own registry, etc. which is a kind of a lie. It’s actually being virtualized.

查看更多
柔情千种
4楼-- · 2018-12-31 05:25

Difference between how apps in VM use cpu vs containers

Source: Kubernetes in Action.

查看更多
情到深处是孤独
5楼-- · 2018-12-31 05:30

I like Ken Cochrane's answer.

But I want to add additional point of view, not covered in detail here. In my opinion Docker differs also in whole process. In contrast to VMs, Docker is not (only) about optimal resource sharing of hardware, moreover it provides a "system" for packaging application (preferable, but not a must, as a set of microservices).

To me it fits in the gap between developer-oriented tools like rpm, Debian packages, Maven, npm + Git on one side and ops tools like Puppet, VMware, Xen, you name it...

Why is deploying software to a docker image (if that's the right term) easier than simply deploying to a consistent production environment?

Your question assumes some consistent production environment. But how to keep it consistent? Consider some amount (>10) of servers and applications, stages in the pipeline.

To keep this in sync you'll start to use something like Puppet, Chef or your own provisioning scripts, unpublished rules and/or lot of documentation... In theory servers can run indefinitely, and be kept completely consistent and up to date. Practice fails to manage a server's configuration completely, so there is considerable scope for configuration drift, and unexpected changes to running servers.

So there is a known pattern to avoid this, the so called immutable server. But the immutable server pattern was not loved. Mostly because of the limitations of VMs that were used before Docker. Dealing with several gigabytes big images, moving those big images around, just to change some fields in the application, was very very laborious. Understandable...

With a Docker ecosystem, you will never need to move around gigabytes on "small changes" (thanks aufs and Registry) and you don't need to worry about losing performance by packaging applications into a Docker container at runtime. You don't need to worry about versions of that image.

And finally you will even often be able to reproduce complex production environments even on your Linux laptop (don't call me if doesn't work in your case ;))

And of course you can start Docker containers in VMs (it's a good idea). Reduce your server provisioning on the VM level. All the above could be managed by Docker.

P.S. Meanwhile Docker uses its own implementation "libcontainer" instead of LXC. But LXC is still usable.

查看更多
人间绝色
6楼-- · 2018-12-31 05:31

Through this post we are going to draw some lines of differences between VMs and LXCs. Let's first define them.

VM:

A virtual machine emulates a physical computing environment, but requests for CPU, memory, hard disk, network and other hardware resources are managed by a virtualization layer which translates these requests to the underlying physical hardware.

In this context the VM is called as the Guest while the environment it runs on is called the host.

LXCs:

Linux Containers (LXC) are operating system-level capabilities that make it possible to run multiple isolated Linux containers, on one control host (the LXC host). Linux Containers serve as a lightweight alternative to VMs as they don’t require the hypervisors viz. Virtualbox, KVM, Xen, etc.

Now unless you were drugged by Alan (Zach Galifianakis- from the Hangover series) and have been in Vegas for the last year, you will be pretty aware about the tremendous spurt of interest for Linux containers technology, and if I will be specific one container project which has created a buzz around the world in last few months is – Docker leading to some echoing opinions that cloud computing environments should abandon virtual machines (VMs) and replace them with containers due to their lower overhead and potentially better performance.

But the big question is, is it feasible?, will it be sensible?

a. LXCs are scoped to an instance of Linux. It might be different flavors of Linux (e.g. a Ubuntu container on a CentOS host but it’s still Linux.) Similarly, Windows-based containers are scoped to an instance of Windows now if we look at VMs they have a pretty broader scope and using the hypervisors you are not limited to operating systems Linux or Windows.

b. LXCs have low overheads and have better performance as compared to VMs. Tools viz. Docker which are built on the shoulders of LXC technology have provided developers with a platform to run their applications and at the same time have empowered operations people with a tool that will allow them to deploy the same container on production servers or data centers. It tries to make the experience between a developer running an application, booting and testing an application and an operations person deploying that application seamless, because this is where all the friction lies in and purpose of DevOps is to break down those silos.

So the best approach is the cloud infrastructure providers should advocate an appropriate use of the VMs and LXC, as they are each suited to handle specific workloads and scenarios.

Abandoning VMs is not practical as of now. So both VMs and LXCs have their own individual existence and importance.

查看更多
无与为乐者.
7楼-- · 2018-12-31 05:33

There are many answers which explain more detailed on the differences, but here is my very brief explanation.

One important difference is that VMs use a separate kernel to run the OS. That's the reason it is heavy and takes time to boot, consuming more system resources.

In Docker, the containers share the kernel with the host; hence it is lightweight and can start and stop quickly.

In Virtualization, the resources are allocated in the beginning of set up and hence the resources are not fully utilized when the virtual machine is idle during many of the times. In Docker, the containers are not allocated with fixed amount of hardware resources and is free to use the resources depending on the requirements and hence it is highly scalable.

Docker uses UNION File system .. Docker uses a copy-on-write technology to reduce the memory space consumed by containers. Read more here

查看更多
登录 后发表回答