I've been reading about Skip Lists lately.
I have a web application that executes quite complex Sql queries against static datasets.
I want to implement a caching system whereby I generate an md5 hash of the sql query and then return a cached dataset for the query if it exists in the collection.
Which algorithm would be better, Dictionary or a SkipList? Why?
http://msdn.microsoft.com/en-us/library/ms379573%28VS.80%29.aspx#datastructures20_4_topic4
Dictionary
, definitely. Two reasons:
Dictionary<TKey, TValue>
uses a hash table, making retrieval O(1) (i.e. constant time), compared to O(log n) in a skip list.
Dictionary<TKey, TValue>
already exists and is well-tested and optimised, whereas a skip list class doesn’t exist to my knowledge, so you would have to implement your own, which takes effort to get it right and test it thoroughly.
Memory consumption is about the same for both (certainly the same complexity, namely O(n)).
The reason you would use a SkipList<T>
vs Dictionary<TKey,TValue>
is that a skip list keeps its items in order. If you regularly need to enumerate the items in order, a skip list is good because it can enumerate in O(n).
If you wanted to be able to enumerate in order but didn't care if enumeration is O(n lg n), a SortedSet<T>
(or more likely a SortedDictionary<TKey, TValue>
) would be what you'd want because they use red-black trees (balanced binary trees) and they are already in the standard library.
Since it's extremely unlikely that you will want to enumerate your cache in order (or at all), a skip list (and likewise a binary tree) is unnecessary.
Skip List gives an average of Log(n) on all dictionary operations. If the number of items is fixed then a lock stripped hash table will do great. An in memory splay tree is also good since cache is the word. Splay trees give faster for the recently accessed item. As such in a dictionary operation such as find; [skip lists were slow compared to splay tree which again were slow compared to hash tables.][1][1]: http://harisankar-krishnaswamy.blogspot.in/2012/04/skip-list-runtime-on-dictionay.html
If localization in data structure is needed then skip lists can be useful. For example, finding flights around a date etc. But, a cache is in memory so a splay is fine. Hashtable and splay trees dont provide localization.