I've recently learned a little bit about hash values, and so also heard about the problem of hash collisions.
I therefore wondered: How does one deal with those?
E.g. Swift's Dictionary uses hash values with its keys. I assume that it looks up its values via the hash. So how would Swift's Dictionary then store values for different keys that happen to have the same hash?
Fundamentally, there are two major ways of handling hash collisions: separate chaining, where items with colliding hash codes are stored together in a separate data structure attached to their bucket, and open addressing, where a colliding item is stored in another available bucket selected by some probing algorithm.
Both strategies have numerous sub-strategies, described on Wikipedia. The exact strategy used by a particular implementation is, not surprisingly, implementation-specific, so the authors can change it at any time for something more efficient without breaking the assumptions of their users.
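To make separate chaining concrete, here is a minimal sketch. `ChainedTable` and `CollidingKey` are hypothetical names made up for illustration; this is not how Swift's Dictionary works internally. The key type deliberately gives every key the same hash, so every insert collides.

```swift
// Hypothetical key type: every instance hashes identically, forcing collisions.
struct CollidingKey: Hashable {
    let name: String
    func hash(into hasher: inout Hasher) { hasher.combine(0) }
}

// Toy separate-chaining table (illustration only, no resizing).
struct ChainedTable<Key: Hashable, Value> {
    // Each bucket holds every (key, value) pair whose hash maps to that index.
    private var buckets: [[(key: Key, value: Value)]]

    init(bucketCount: Int = 8) {
        precondition(bucketCount > 0)
        buckets = Array(repeating: [], count: bucketCount)
    }

    private func index(for key: Key) -> Int {
        // bitPattern avoids a negative index when hashValue is negative.
        return Int(UInt(bitPattern: key.hashValue) % UInt(buckets.count))
    }

    subscript(key: Key) -> Value? {
        get {
            // Same bucket for colliding keys; == picks the right entry.
            return buckets[index(for: key)].first(where: { $0.key == key })?.value
        }
        set {
            let i = index(for: key)
            buckets[i].removeAll { $0.key == key }
            if let newValue = newValue {
                buckets[i].append((key: key, value: newValue))
            }
        }
    }
}

var chained = ChainedTable<CollidingKey, Int>(bucketCount: 4)
chained[CollidingKey(name: "a")] = 1
chained[CollidingKey(name: "b")] = 2   // same hash as "a", lands in the same bucket
```

Both keys end up in one bucket, and the lookup tells them apart by comparing keys with `==`. That is why a key type has to be Equatable as well as hashable: the hash only narrows down where to look.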
At this point, the only way to find out how Swift handles collisions would be disassembling the library (that is, unless you work for Apple and have access to the source code). Curious people did that to NSDictionary, and determined that it uses linear probing, the simplest variation of the open addressing technique.
Swift dictionaries use open addressing with linear probing.
Here is a link to the actual source documentation explaining everything: https://github.com/apple/swift/blob/master/stdlib/public/core/HashedCollections.swift.gyb
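As a rough illustration of the linear probing idea (not the standard library's actual code), here is a minimal sketch. `ProbingTable` and `ConstantHashKey` are hypothetical names, and the table is assumed to have a fixed capacity with no removal or resizing, which a real implementation would need to handle.

```swift
// Hypothetical key type: every instance hashes identically, forcing collisions.
struct ConstantHashKey: Hashable {
    let name: String
    func hash(into hasher: inout Hasher) { hasher.combine(0) }
}

// Toy open-addressing table with linear probing (illustration only).
struct ProbingTable<Key: Hashable, Value> {
    // One flat array of slots; nil means the slot is empty.
    private var slots: [(key: Key, value: Value)?]

    init(capacity: Int = 8) {
        precondition(capacity > 0)
        slots = Array(repeating: nil, count: capacity)
    }

    // Walks forward from the hashed index, wrapping around, until it finds
    // the key or an empty slot. Returns nil if the table is full and the
    // key is not present.
    private func slot(for key: Key) -> Int? {
        let start = Int(UInt(bitPattern: key.hashValue) % UInt(slots.count))
        for offset in 0..<slots.count {
            let i = (start + offset) % slots.count
            guard let entry = slots[i] else { return i }  // empty slot
            if entry.key == key { return i }              // existing key
        }
        return nil
    }

    func value(for key: Key) -> Value? {
        guard let i = slot(for: key) else { return nil }
        return slots[i]?.value
    }

    // Returns false when there is no room left for a new key.
    @discardableResult
    mutating func insert(_ value: Value, for key: Key) -> Bool {
        guard let i = slot(for: key) else { return false }
        slots[i] = (key: key, value: value)
        return true
    }
}

var probing = ProbingTable<ConstantHashKey, Int>(capacity: 2)
probing.insert(1, for: ConstantHashKey(name: "a"))
probing.insert(2, for: ConstantHashKey(name: "b"))  // collides, takes the next slot
let inserted = probing.insert(3, for: ConstantHashKey(name: "c"))  // table is full
```

Because colliding entries simply occupy the next free slot, everything lives in one contiguous array, which tends to be cache-friendly. The cost is that lookups slow down as the table fills, so real implementations grow the storage well before it is full.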
There are two basic techniques:
Or both.