
Hashtable and list side by side?

Posted 2020-07-29 23:10

Question:

I need a data structure that is ordered but also gives fast random access, inserts, and removes. Linked lists are ordered and fast for inserts and removes, but they give slow random access. Hashtables give fast random access but are not ordered.

So, it seems nice to use both of them together. In my current solution, the hashtable holds iterators into the list, and the list contains the actual items. Nice and effective. Okay, it requires double the memory, but that's not an issue.
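For illustration, a minimal Java sketch of that idea (the class and method names are made up for this sketch; Java has no stable list iterators, so the map stores the list nodes directly, which serves the same purpose, and for brevity it assumes distinct keys):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a hash map for O(1) expected lookup by key, plus a hand-rolled
// doubly-linked list that preserves insertion order.
class OrderedMap<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> prev, next;
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private final Map<K, Node<K, V>> index = new HashMap<>();
    private Node<K, V> head, tail;

    // O(1) expected: append to the list tail, record the node in the map.
    public void put(K key, V value) {
        Node<K, V> node = new Node<>(key, value);
        index.put(key, node);
        if (tail == null) { head = tail = node; }
        else { tail.next = node; node.prev = tail; tail = node; }
    }

    // O(1) expected: a straight hash lookup.
    public V get(K key) {
        Node<K, V> node = index.get(key);
        return node == null ? null : node.value;
    }

    // O(1) expected: the map finds the node; unlinking it is constant time.
    public V remove(K key) {
        Node<K, V> node = index.remove(key);
        if (node == null) return null;
        if (node.prev != null) node.prev.next = node.next; else head = node.next;
        if (node.next != null) node.next.prev = node.prev; else tail = node.prev;
        return node.value;
    }
}
```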

I have heard that some tree structures can do this as well, but are they as fast as this solution?

Answer 1:

The most efficient tree structure I know is the red-black tree, and it's not as fast as your solution: it takes O(log n) for all operations, while your solution takes O(1) for some, if not all, of them.

If memory is not an issue and you're sure your solution is O(1), meaning the time required to add/delete/find an item in the structure does not depend on the number of items you have, go for it.



Answer 2:

You should consider a skip list, which is an ordered linked list with O(log n) access times. In other words, you can enumerate it in O(n), and index/insert/delete are O(log n).
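If you happen to be on the JVM, the standard library already ships a skip-list-based sorted map, java.util.concurrent.ConcurrentSkipListMap. A small usage sketch (note it orders by key, not by insertion order):

```java
import java.util.concurrent.ConcurrentSkipListMap;

// ConcurrentSkipListMap is backed by a skip list: O(log n) expected
// insert/delete/lookup, and in-order enumeration in O(n).
public class SkipListDemo {
    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> map = new ConcurrentSkipListMap<>();
        map.put(30, "thirty");
        map.put(10, "ten");
        map.put(20, "twenty");

        map.remove(20);                       // O(log n) delete
        System.out.println(map.get(10));      // O(log n) lookup -> "ten"
        map.forEach((k, v) -> System.out.println(k + " -> " + v)); // sorted by key
    }
}
```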



Answer 3:

Trees are made for this. The most appropriate are self-balancing trees like the AVL tree or the red-black tree. If you deal with very large amounts of data, a B-tree may also be useful (they are used for filesystems, for example).
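For instance, java.util.TreeMap is a red-black tree under the hood; a quick sketch of the guaranteed O(log n) operations:

```java
import java.util.TreeMap;

// TreeMap is a red-black tree: every operation below is guaranteed
// O(log n), and iteration yields keys in sorted order.
public class TreeDemo {
    public static void main(String[] args) {
        TreeMap<String, Integer> ages = new TreeMap<>();
        ages.put("carol", 41);
        ages.put("alice", 29);
        ages.put("bob", 35);

        System.out.println(ages.get("bob"));  // O(log n) lookup -> 35
        ages.remove("alice");                 // O(log n) delete
        System.out.println(ages.firstKey());  // "bob": sorted access
    }
}
```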

Concerning your implementation: it may be more or less efficient than trees, depending on the amount of data you work with and on the hashtable implementation. E.g., some hash tables with very densely packed data may give access not in O(1) but in O(log n) or even O(n). Also remember that computing the hash of the data takes some time too, so for quite small amounts of data, the absolute time for computing the hash may exceed the time for searching a tree.



Answer 4:

What you did is pretty much the right choice.

The cool thing about this is that adding ordering to an existing map implementation by using a double-ended doubly-linked list doesn't actually change its asymptotic complexity, because all the relevant list operations (appending and deleting) have a worst-case step complexity of Θ(1). (Yes, deletion is Θ(1), too. The reason it is usually Θ(n) is that you have to find the element to delete first, which is Θ(n), but the actual deletion itself is Θ(1). In this particular case, you let the map do the finding, which is something like Θ(1) amortized worst-case step complexity or Θ(log_b n) worst-case step complexity, depending on the type of map implementation used.)
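To make the Θ(1)-deletion point concrete, here is the unlink step in isolation (a minimal sketch; the map hands you the node, so no searching is involved, and a circular self-linked node avoids head/tail edge cases):

```java
// Sketch of a doubly-linked list node whose removal is Θ(1):
// a constant number of pointer updates, regardless of list length.
final class Node<T> {
    T value;
    Node<T> prev = this, next = this; // a lone node is its own neighbor

    void unlink() {
        prev.next = next;
        next.prev = prev;
        prev = next = this;
    }
}
```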

The Hash class in Ruby 1.9, for example, is an ordered map, and it is implemented at least in YARV and Rubinius as a hash table embedded into a linked list.

Trees generally have a worst-case step complexity of Θ(log_b n) for random access, whereas hash tables may be worse in the worst case (Θ(n)), but usually amortize to Θ(1), provided you don't screw up the hash function or the resize function.

[Note: I'm deliberately only talking about asymptotic behavior here, aka "infinitely large" collections. If your collections are small, then just choose the one with the smallest constant factors.]



Answer 5:

Java actually contains a LinkedHashMap, which is similar to the data structure you're describing. It can be surprisingly useful at times.
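A small usage sketch: LinkedHashMap keeps a doubly-linked list threaded through its hash entries, so iteration follows insertion order while lookups stay O(1) expected.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LinkedHashMapDemo {
    public static void main(String[] args) {
        Map<String, Integer> scores = new LinkedHashMap<>();
        scores.put("first", 1);
        scores.put("second", 2);
        scores.put("third", 3);

        scores.remove("second");              // O(1) expected
        // Prints "first -> 1" then "third -> 3", in insertion order.
        scores.forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```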

Tree structures could work as well, since they can perform random access (and most other operations) in O(log n) time. Not as fast as hashtables (O(1)), but still fast unless your dataset is very large.

The only real advantage of trees is that you don't need to decide on the capacity beforehand. Some hashtable implementations can grow their capacity as needed, but they do so by copying all items into a new, larger hash table once the old one exceeds its capacity, which is an O(n) operation.
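If you can estimate the size up front, most hash tables let you pre-size and skip those copies entirely. A Java sketch (the capacity math assumes HashMap's default 0.75 load factor):

```java
import java.util.HashMap;
import java.util.Map;

// HashMap resizes when size exceeds capacity * loadFactor (0.75 by
// default). Sizing the table once up front avoids every O(n)
// rehash-and-copy step along the way.
public class PresizeDemo {
    public static void main(String[] args) {
        int expected = 1_000_000;
        Map<Integer, Integer> m = new HashMap<>((int) (expected / 0.75f) + 1);
        for (int i = 0; i < expected; i++) {
            m.put(i, i * i); // no intermediate resizes
        }
        System.out.println(m.size());
    }
}
```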