Disk based HashMap

Published 2019-01-07 15:46

Question:

Does Java have (or is there a library available that provides) a disk-based HashMap? It doesn't need to be atomic or anything, but it will be accessed by multiple threads and shouldn't crash if two of them access the same element at the same time.

Anyone know of anything?

Answer 1:

Either properties files or Berkeley DB might be what you're looking for. java.util.Properties itself implements java.util.Map and provides methods to load from and store to a file. Berkeley DB is often recommended as a lightweight key-value datastore.
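To illustrate the properties-file route, here is a minimal sketch using only the JDK. The class name PropertiesFileMap, the helper methods and the temp-file location are made up for this example. Note that Properties keeps every entry in memory and only touches the file when load or store is called, and that keys and values are expected to be strings, so this suits small maps rather than large datasets.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper class, not part of any library.
class PropertiesFileMap {

    // Writes every entry of the map to the given file, replacing its contents.
    static void save(Properties props, Path file) throws IOException {
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "demo data");
        }
    }

    // Reads all entries from the file into a fresh in-memory Properties object.
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".properties"); // assumed location
        Properties props = new Properties();
        props.setProperty("user.1", "alice");
        props.setProperty("user.2", "bob");
        save(props, file);

        Properties reloaded = load(file);
        System.out.println(reloaded.getProperty("user.1")); // prints "alice"
        Files.deleteIfExists(file);
    }
}
```

Individual calls on a Properties object are thread-safe, but nothing here coordinates concurrent save calls on the same file, so that would have to be added by the caller.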



Answer 2:

MapDB

MapDB provides concurrent TreeMap and HashMap implementations backed by disk storage or off-heap memory. It is a fast, scalable and easy-to-use embedded Java database engine. It is packed with features such as transactions, space-efficient serialization, an instance cache and transparent compression/encryption. It also has outstanding performance, rivaled only by native embedded db engines.

http://www.mapdb.org/

jdbm2

Embedded Key Value Java database.

https://code.google.com/p/jdbm2/



Answer 3:

Sounds like you need something close to a lightweight db. Have you looked at/considered Java DB? A light db with a single, indexed table would basically be a disk-based, thread-safe hash map.



Answer 4:

JDBM2 is exactly what you are asking for. It provides a HashMap backed by disk storage (among other maps). It's fast and thread-safe, and the API is really simple.



Answer 5:

Project Voldemort is also a really fast, scalable, replicated "HashMap". It is used at LinkedIn, and performance is also pretty good:

A quote from their site:

Here is the throughput we see from a single multithreaded client talking to a single server where the "hot" data set is in memory under artificially heavy load in our performance lab:

Reads: 19,384 req/sec
Writes: 16,559 req/sec



Answer 6:

Chronicle Map implements ConcurrentMap and persists data to disk by mapping its memory to a file.

Chronicle Map is conceptually very similar to MapDB (it provides a similar builder API and the Map interface), but Chronicle Map is several times faster than MapDB and has much better concurrency (Chronicle Map uses highly striped multi-level spin locks).
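For readers unfamiliar with the idea of mapping memory to a file, here is a rough sketch of the underlying JDK mechanism (FileChannel.map). This is not Chronicle Map's API and not how it is implemented internally. The class name MappedSlots, the fixed slot count and the overwrite-on-collision behaviour are all assumptions made to keep the example short, and there is no locking.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical toy: a fixed table of long values stored in a memory-mapped file.
class MappedSlots {
    static final int SLOT_COUNT = 16;        // assumed fixed table size
    static final int SLOT_BYTES = Long.BYTES;
    static final long FILE_BYTES = (long) SLOT_COUNT * SLOT_BYTES;

    // Byte offset of the slot chosen by the key's hash.
    static int slotOffset(int key) {
        return Math.floorMod(Integer.hashCode(key), SLOT_COUNT) * SLOT_BYTES;
    }

    // Stores the value in the key's slot; colliding keys simply overwrite each other.
    static void put(Path file, int key, long value) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping READ_WRITE grows the file to FILE_BYTES if it is shorter.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, FILE_BYTES);
            buf.putLong(slotOffset(key), value);
            buf.force(); // flush the change to the storage device
        }
    }

    // Reads the value in the key's slot; the file must already have been written by put.
    static long get(Path file, int key) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, FILE_BYTES);
            return buf.getLong(slotOffset(key));
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("slots", ".bin"); // assumed location
        put(file, 42, 123456789L);
        System.out.println(get(file, 42)); // prints 123456789
    }
}
```

A real library adds what this toy leaves out: variable-sized keys and values, collision handling, resizing, and the locking that makes concurrent access safe.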



Answer 7:

So the year is now 2016. If anyone's looking to tackle this problem, I found out that the low-level Environments API in Xodus from JetBrains works for this same purpose, using lambdas passed to their computeInTransaction method on a store.

Granted, it's not as slick as having a pure Map instance, but it worked for my use case.

Another recent option is H2's MVStore storage engine, which does the same thing, but I think it's more tailored towards the database itself.

Cheers!



Answer 8:

In 2018 the lightest persistent key-value store is the H2 Database with its MVStore:

The MVStore is a persistent, log structured key-value store. It is planned to be the next storage subsystem of H2, but it can also be used directly within an application, without using JDBC or SQL.

  • MVStore stands for "multi-version store".

  • Each store contains a number of maps that can be accessed using the java.util.Map interface.

  • Both file-based persistence and in-memory operation are supported.

  • It is intended to be fast, simple to use, and small.

  • Concurrent read and write operations are supported.

  • Transactions are supported (including concurrent transactions and 2-phase commit).

  • The tool is very modular. It supports pluggable data types and serialization, pluggable storage (to a file, to off-heap memory), pluggable map implementations (B-tree, R-tree, concurrent B-tree currently), BLOB storage, and a file system abstraction to support encrypted files and zip files.

H2 also ships as a single library of 1.8 MB.

I also looked at:

  • MapDB (13 MB of dependencies)
  • chronicle-map (5.5 MB of dependencies; fast, optionally distributed)
  • lmdbjava (2 MB of Java dependencies plus the LMDB C library); the fastest implementation, but not thread-safe out of the box.