What is special about internal design of LMDB?

2019-05-17 08:52发布

问题:

What would be the performance difference (reads/writes) between some C++ implementation of in-memory B-Tree (for example google btree) and the LMDB (without taking into consideration all the feacures of LMDB such as transactions, isolation, shared access etc.)?

回答1:

This 2014 lmdb design presentation by its architect Howard Chu covers the design and tradeoffs of lmdb.

To summarize: lmdb is a copy-on-write, bottom-up updated, double-buffered, b-tree where the implementation always favors simplicity whenever it clashes with other considerations.

The smart design choices make it one of the highest performance and corruption-resistant B-tree implementations out there.

  • copy-on-write means that data is never overwritten avoiding many possible corruption scenarios
  • bottom-up updates from leaf to root make the root update equivalent to a commit
  • double buffering of past versions keeps only the last-two roots in the db file
  • complex multi-level caching schemes are avoided - lmdb relies on the underlying OS for caching
  • The whole code base is very small compared to other DBs avoiding CPU cache misses

Obviously, these choices mean that lmdb is not friendly to complex scenarios such as:

  • multi-version DB rollbacks (only last 2 are available)
  • long-lived transactions and delayed commits: these lead to append-only behavior and potentially unlimited growth of the db file
  • multiple concurrent writers: lmdb favors simpler multiple readers and single writer schemes

There's much more in the full (over 100 pages) presentation. The above is just a summary of the spirit of lmdb.

lmdb is used as the core storage engine in prominent open source projects such as Open LDAP and Memcached and in both cases speed-ups of orders of magnitude have been observed compared to alternatives as can be seen in micro-benchmark results.



标签: lmdb