Meaning of Open hashing and Closed hashing

2019-01-29 18:36发布

问题:

Open Hashing (Separate Chaining):

In open hashing, keys are stored in linked lists attached to cells of a hash table.

Closed Hashing (Open Addressing):

In closed hashing, all keys are stored in the hash table itself without the use of linked lists.

I am unable to understand why they are called open, closed and Separate. Can some one explain it?

回答1:

The use of "closed" vs. "open" reflects whether or not we are locked in to using a certain position or data structure (this is an extremely vague description, but hopefully the rest helps).

For instance, the "open" in "open addressing" tells us the index (aka. address) at which an object will be stored in the hash table is not completely determined by its hash code. Instead, the index may vary depending on what's already in the hash table.

The "closed" in "closed hashing" refers to the fact that we never leave the hash table; every object is stored directly at an index in the hash table's internal array. Note that this is only possible by using some sort of open addressing strategy. This explains why "closed hashing" and "open addressing" are synonyms.

Contrast this with open hashing - in this strategy, none of the objects are actually stored in the hash table's array; instead once an object is hashed, it is stored in a list which is separate from the hash table's internal array. "open" refers to the freedom we get by leaving the hash table, and using a separate list. By the way, "separate list" hints at why open hashing is also known as "separate chaining".

In short, "closed" always refers to some sort of strict guarantee, like when we guarantee that objects are always stored directly within the hash table (closed hashing). Then, the opposite of "closed" is "open", so if you don't have such guarantees, the strategy is considered "open".



回答2:

The name open addressing refers to the fact that the location ("address") of the element is not determined by its hash value. (This method is also called closed hashing).

In separate chaining, each bucket is independent, and has some sort of ADT (list, binary search trees, etc) of entries with the same index. In a good hash table, each bucket has zero or one entries, because we need operations of order O(1) for insert, search, etc.

This is a example of separate chaining using C++ with a simple hash function using mod operator (clearly, a bad hash function)



回答3:

You have an array that is the "hash table".

In Open Hashing each cell in the array points to a list containg the collisions. The hashing has produced the same index for all items in the linked list.

In Closed Hashing you use only one array for everything. You store the collisions in the same array. The trick is to use some smart way to jump from collision to collision unitl you find what you want. And do this in a reproducible / deterministic way.



标签: hash