What would be the performance difference (reads/writes) between some C++ implementation of in-memory B-Tree (for example google btree) and the LMDB (without taking into consideration all the feacures of LMDB such as transactions, isolation, shared access etc.)?
问题:
回答1:
This 2014 lmdb design presentation by its architect Howard Chu covers the design and tradeoffs of lmdb
.
To summarize: lmdb
is a copy-on-write, bottom-up updated, double-buffered, b-tree where the implementation always favors simplicity whenever it clashes with other considerations.
The smart design choices make it one of the highest performance and corruption-resistant B-tree implementations out there.
- copy-on-write means that data is never overwritten avoiding many possible corruption scenarios
- bottom-up updates from leaf to root make the root update equivalent to a commit
- double buffering of past versions keeps only the last-two roots in the db file
- complex multi-level caching schemes are avoided -
lmdb
relies on the underlying OS for caching - The whole code base is very small compared to other DBs avoiding CPU cache misses
Obviously, these choices mean that lmdb
is not friendly to complex scenarios such as:
- multi-version DB rollbacks (only last 2 are available)
- long-lived transactions and delayed commits: these lead to append-only behavior and potentially unlimited growth of the db file
- multiple concurrent writers:
lmdb
favors simpler multiple readers and single writer schemes
There's much more in the full (over 100 pages) presentation. The above is just a summary of the spirit of lmdb
.
lmdb
is used as the core storage engine in prominent open source projects such as Open LDAP and Memcached and in both cases speed-ups of orders of magnitude have been observed compared to alternatives as can be seen in micro-benchmark results.