Clojure map limits and consistency

Posted 2019-06-19 22:57

Question:

Considering that Clojure uses a 32-bit hash for its map implementation, I would like to know whether a Clojure map is therefore limited to 2^32-1 keys (and, if not, how it manages collisions), and whether its hashing implementation is consistent. TIA!

Answer 1:

Clojure maps are a custom implementation that is persistent and immutable. They do not use Java's HashMap, which would not provide sufficient performance as the basis of an immutable data structure.
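As a small illustration of what "persistent and immutable" means in practice, here is a minimal REPL sketch. The names m1 and m2 are made up for the example; the point is only that assoc returns a new map and leaves the original untouched.

```clojure
;; A persistent map is never modified in place: assoc returns a new map.
(def m1 (hash-map :a 1 :b 2))
(def m2 (assoc m1 :c 3))

;; The original map is unchanged.
(println (count m1)) ;; 2
(println (count m2)) ;; 3
(println (contains? m1 :c)) ;; false

;; hash-map with arguments builds the custom Clojure type, not java.util.HashMap.
(println (instance? clojure.lang.PersistentHashMap m1)) ;; true
```

The two maps share most of their internal structure, which is what makes the "copy" cheap.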

It uses 32-bit hash codes, hence 2^32 possible hash values. Keys whose hash codes collide are stored together with their values in an array for that hash code, so the number of distinct hash codes does not cap the number of keys, and in principle a map can hold more than 2^32 of them. See the PersistentHashMap source: the HashCollisionNode inner class is what stores a bucket of keys and values against a single hash code value.
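The collision behaviour can be seen from the REPL. This is a sketch that assumes Clojure derives a string's hash from its java.lang.String hashCode (true of the versions I am aware of, 1.6 and later); the strings "Aa" and "BB" are a well-known pair with the same hashCode, and the names k1, k2 and m are made up for the example.

```clojure
;; "Aa" and "BB" share the same java.lang.String hashCode (2112).
;; Assumption: Clojure's hash for strings is computed from that value,
;; so the two keys also collide under clojure.core/hash.
(def k1 "Aa")
(def k2 "BB")

(println (= (.hashCode k1) (.hashCode k2))) ;; true
(println (= (hash k1) (hash k2)))           ;; true

;; Despite the collision, both keys coexist and are looked up independently,
;; because equality (not just the hash) decides which entry matches.
(def m (hash-map k1 :first k2 :second))

(println (count m))  ;; 2
(println (get m k1)) ;; :first
(println (get m k2)) ;; :second
```

The hash only narrows the search to a bucket; the final match within the bucket is made by comparing keys for equality.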

Since the number of possible hash buckets is fixed, consistent hashing is irrelevant: keys never need to be remapped.

See also:

  • http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey (presentation explaining Clojure's approach to concurrency, which also covers the persistent immutable data structures)