Why does MongoDB perform better with multi-threade

Posted 2019-03-27 21:23

Question:

We recently benchmarked Oracle 10g and MongoDB with YCSB ( https://github.com/brianfrankcooper/YCSB/wiki ). When we increased the number of client threads on a dataset of 1,000,000 records, Oracle's performance stayed constant beyond 4 threads. MongoDB kept performing better up to 8 threads; after that only reads improved, while writes and updates (operations/sec) stayed constant.

We ran this benchmark on a machine with 2 quad-core Xeon CPUs (8 cores total) and 8 GB RAM, over a LAN.

What we observed was that MongoDB performed better with a multi-threaded client than with a single-threaded one. My question is: if MongoDB can perform better under more load, why can't it do the same under less load (say, just a couple of threads) by utilizing the multiple cores?

Answer 1:

It is logically very simple to process a request on a single core. Just have code that receives the request, and deals with it.

It is not nearly so simple to process a single request on 2 cores, because doing so requires you to break up the request into components, farm out the work, synchronize the answers, and then build up a single response. And if you do this work, while you can reduce wallclock time (how much time the clock on the wall sees pass), you're invariably going to make the request take more CPU time (total CPU resources consumed).
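To make the extra steps concrete, here is a minimal Python sketch of both approaches. The function names and the sum-of-squares "request" are hypothetical stand-ins, not anything from MongoDB or Oracle. It only shows the shape of the split, farm-out, synchronize, combine work; in CPython the threads would not actually speed up CPU-bound code because of the global interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_on_one_core(values):
    # The simple path: one worker receives the request and deals with it.
    return sum(v * v for v in values)


def handle_on_many_cores(values, workers=2):
    # 1. Break the request up into components.
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]

    # 2. Farm out the work, and 3. wait until every partial answer is in.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(handle_on_one_core, chunks))

    # 4. Build a single response out of the partial answers.
    return sum(partials)


request = list(range(10))
single = handle_on_one_core(request)
split = handle_on_many_cores(request, workers=3)
```

Both paths return the same answer, but the second one pays for chunking, thread handoff, and the final merge on top of the real work, which is the extra CPU time mentioned above.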

In a system like MongoDB where you expect to have a lot of different clients making requests, there is no need to try to parallelize the handling of a single request, and every reason not to.

The bigger question is why Oracle's throughput didn't increase beyond 4 threads. There are any number of possible reasons, but one reasonable guess is that you encountered some sort of locking which is needed to guarantee consistency. (MongoDB does not offer you the same consistency guarantees, and so avoids this type of bottleneck.)



Answer 2:

Oracle doesn't lock data for consistency but it does write data to redo and undo files for transactions and read consistency. Oracle is a MVCC system. See http://en.wikipedia.org/wiki/Multiversion_concurrency_control .

You have to use parameterized queries to make Oracle fast, else Oracle will spend too much time parsing queries. This is especially important when a lot of small queries run simultaneously, the situation you are testing.
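As a rough illustration of the difference, here is a sketch using Python's built-in `sqlite3` module as a stand-in, since it runs without an Oracle server. The table and column names are assumptions chosen for the example. The point carries over: with literals every key produces a new SQL string to parse, while with a bind parameter the statement text stays identical across calls.

```python
import sqlite3

# In-memory stand-in database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usertable (ycsb_key TEXT PRIMARY KEY, field0 TEXT)")
conn.executemany(
    "INSERT INTO usertable VALUES (:key, :value)",
    [{"key": f"user{i}", "value": f"v{i}"} for i in range(100)],
)


def read_literal(key):
    # Anti-pattern: the SQL text differs for every key, so the database
    # sees a brand-new statement each time (and it invites SQL injection).
    sql = f"SELECT field0 FROM usertable WHERE ycsb_key = '{key}'"
    return conn.execute(sql).fetchone()


def read_parameterized(key):
    # The SQL text is constant; only the bound value changes per call.
    sql = "SELECT field0 FROM usertable WHERE ycsb_key = :key"
    return conn.execute(sql, {"key": key}).fetchone()


row = read_parameterized("user7")
```

Oracle drivers accept the same `:name` style for bind variables, so a benchmark client would look much the same there, but check the driver you actually use.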

MongoDB does lock on writes.

edit 1:

Another big difference between Oracle and MongoDB is durability. MongoDB doesn't offer durability if you use the default configuration: it writes data to disk once every minute. Oracle writes to disk with every commit, so Oracle does a lot more fsyncing.
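The cost difference can be sketched with plain file I/O. This is not how either database is implemented; `write_records` is a hypothetical helper, and it batches by record count rather than by time to keep the example deterministic. It just counts how many times the process has to wait for `fsync` under each policy.

```python
import os
import tempfile


def write_records(path, records, sync_every):
    """Append records, forcing them to disk after every `sync_every` records.

    Returns the number of fsync calls that were made.
    """
    syncs = 0
    with open(path, "ab") as f:
        for i, record in enumerate(records, start=1):
            f.write(record + b"\n")
            if i % sync_every == 0:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to push it to the disk
                syncs += 1
    return syncs


records = [b"row-%d" % i for i in range(200)]
with tempfile.TemporaryDirectory() as tmp:
    # "Sync on every commit" policy: one fsync per record.
    per_commit = write_records(os.path.join(tmp, "a.log"), records, sync_every=1)
    # "Flush periodically" policy: one fsync per batch of 50 records.
    batched = write_records(os.path.join(tmp, "b.log"), records, sync_every=50)
```

Here the first policy issues 200 fsyncs and the second only 4. The flip side is that with batching, anything written since the last sync is lost if the machine crashes.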