Databases are designed to consume all the memory, CPU and IO available to them. Are there good reasons why Docker should not be used for databases in production?
Maybe this question also applies to other tools, such as message-oriented middleware (Apache Kafka, Apache ActiveMQ, etc.).
Docker is not a full-scale virtual machine (at least when run on Linux); a container is just another process running on the same kernel as the host machine. Moreover, all Docker container processes can be seen on the host machine with ps aux.
The only additional load Docker adds is loading another OS userland on top of your kernel, but most containers are actually deployed on extremely lightweight bases such as Alpine Linux, so I don't think it really has to be taken into consideration.
From another point of view, having a database (or any other high-load service) in a container brings advantages of its own, for example when it is deployed with an orchestrator (k8s).
So deploying containerized services today is sound practice.
Containers are designed to be able to regulate resource usage through the use of cgroups, and so as long as we are able to predict usage, we should have no issue (with performance) running it in a container. There are other considerations besides the resource usage, however.
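To make the cgroups point concrete, here is a minimal sketch of how a process inside a container could discover its own memory limit and size a cache from it. It assumes cgroup v2, where the limit is exposed in /sys/fs/cgroup/memory.max; the sizing rule (suggest_cache_bytes, half of the limit, a 256 MiB fallback) is a made-up illustration, not a recommendation from any particular database.

```python
from pathlib import Path

# cgroup v2 exposes the memory limit of the current cgroup in this file.
# (Assumption: cgroup v1 hosts use a different path and are not handled here.)
CGROUP_V2_MEMORY_MAX = Path("/sys/fs/cgroup/memory.max")


def parse_memory_max(raw):
    """Return the limit in bytes, or None when the cgroup is unlimited ("max")."""
    value = raw.strip()
    if value == "max":
        return None
    return int(value)


def read_memory_limit(path=CGROUP_V2_MEMORY_MAX):
    """Read the memory limit; None if it is unlimited or cannot be read."""
    try:
        return parse_memory_max(path.read_text())
    except (OSError, ValueError):
        return None


def suggest_cache_bytes(limit_bytes, fraction=0.5, fallback=256 * 1024 * 1024):
    """Hypothetical sizing rule: give the cache a fixed fraction of the limit."""
    if limit_bytes is None:
        return fallback
    return int(limit_bytes * fraction)


if __name__ == "__main__":
    limit = read_memory_limit()
    print("memory limit:", limit, "suggested cache:", suggest_cache_bytes(limit))
```

The point of the sketch is only that the limit is visible and predictable from inside the container, so a database can be configured to stay within it rather than assuming it owns the whole host.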
In an architecture like Kubernetes, it becomes more complex to manage database deployments, in part because containers are now ephemeral. If a pod goes down on a given node, there is no guarantee it will be brought back up on that same node, and so special considerations need to be made for stateful applications (the pod must be reattached to the same volume on relaunch, etc.). This is where constructs like StatefulSets come in. So it works, and the solutions are very well thought out, but there are a few more operational hoops to jump through.
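As a rough illustration of what a StatefulSet adds, the sketch below builds a manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The names and values ("db", "postgres:16", the 10Gi and 2Gi sizes) are placeholders chosen for the example, and the manifest is deliberately incomplete: a real deployment would also need a headless Service matching serviceName, credentials, probes and so on.

```python
import json


def statefulset_manifest(name, image, data_path, storage, memory_limit, replicas=1):
    """Build a minimal StatefulSet whose pods each get their own persistent volume."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # assumes a headless Service with this name exists
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Enforced through cgroups on the node.
                        "resources": {"limits": {"memory": memory_limit}},
                        "volumeMounts": [{"name": "data", "mountPath": data_path}],
                    }]
                },
            },
            # One PersistentVolumeClaim per pod; it is reattached when the pod
            # is rescheduled, which is what keeps the data with the pod identity.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


# Placeholder values for illustration only.
manifest = statefulset_manifest(
    "db", "postgres:16", "/var/lib/postgresql/data", "10Gi", "2Gi"
)

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```

The part that matters for the argument is volumeClaimTemplates: the claim name ("data") matches the volume mount, so pod db-0 keeps getting the same volume wherever it lands.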
There are also things like Operators that can handle the complex needs of bringing up and managing a stateful application like a database or distributed message queue. These projects can be quite green at times, but there's a lot of behavior that would be tough to orchestrate on bare metal that we get right out of the box.
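At their core, Operators follow a control-loop idea: compare the desired state with what is actually running and emit the actions that close the gap. The sketch below shows only that shape in plain Python. The function names and the "kafka" cluster are hypothetical, and a real Operator would watch the Kubernetes API and deal with far more (failover, backups, upgrades) than creating and deleting members.

```python
def desired_members(name, replicas):
    """Names of the members we want, e.g. kafka-0, kafka-1, ..."""
    return [f"{name}-{i}" for i in range(replicas)]


def reconcile(name, replicas, observed):
    """Return the (action, member) steps that move `observed` toward the desired set."""
    wanted = desired_members(name, replicas)
    observed_set = set(observed)
    # Create whatever is missing, in ordinal order.
    actions = [("create", member) for member in wanted if member not in observed_set]
    # Delete whatever should not exist; sorted only to make the output deterministic.
    extra = sorted(observed_set - set(wanted))
    actions += [("delete", member) for member in extra]
    return actions


if __name__ == "__main__":
    # One member of a three-member cluster has disappeared.
    print(reconcile("kafka", 3, ["kafka-0", "kafka-2"]))
```

Running the loop repeatedly is what makes the system self-healing: once the observed state matches the desired state, reconcile returns an empty list and nothing happens.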
In the end, running stateful applications like databases or message queues on Kubernetes (or other container orchestrators) is a contentious topic. Like all design decisions, it involves tradeoffs between resilience, complexity, and debuggability. Many large companies are doing it in production, so it's by no means unreasonable.