This may appear to be a duplicate of this question, which asks "What’s the difference between SortedList and SortedDictionary?" Unfortunately, the answers do nothing more than quote the MSDN documentation (which clearly states that there are performance and memory use differences between the two) but don't actually answer the question.
In fact (and so this question doesn't get the same answers), according to MSDN:
The SortedList<TKey, TValue> generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary. In this, it is similar to the SortedDictionary<TKey, TValue> generic class. The two classes have similar object models, and both have O(log n) retrieval. Where the two classes differ is in memory use and speed of insertion and removal:
SortedList<TKey, TValue> uses less memory than SortedDictionary<TKey, TValue>.
SortedDictionary<TKey, TValue> has faster insertion and removal operations for unsorted data, O(log n) as opposed to O(n) for SortedList<TKey, TValue>.
If the list is populated all at once from sorted data, SortedList<TKey, TValue> is faster than SortedDictionary<TKey, TValue>.
So, clearly this would indicate that SortedList<TKey, TValue> is the better choice unless you need faster insert and remove operations for unsorted data.
The question still remains: given the information above, what are the practical (real-world, business case, etc.) reasons for using a SortedDictionary<TKey, TValue>? Based on the performance information, it would seem that there really is no need to have SortedDictionary<TKey, TValue> at all.
Visual representation of performance differences.
That's all there is to it. Retrieval of keys is comparable, but addition is much faster with Dictionaries.
I try to use SortedList as much as possible because it allows me to access the keys and values collections by index. This is not possible with SortedDictionary as far as I know.
I'm not sure about this, but as far as I know Dictionaries store data in tree structures, whereas Lists store data in linear arrays. That explains why insertion and removal are much faster with dictionaries, since less memory has to be shifted around. It also explains why you can index into a SortedList but not a SortedDictionary.
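For what it's worth, both collections can be enumerated in key order; what SortedList adds is access by position. Here is a minimal sketch of that difference. The class and the KeyAt helpers are made up for illustration; Keys, Values, IndexOfKey and LINQ's ElementAt are the real framework members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PositionalAccessDemo
{
    // SortedList exposes Keys as an IList<TKey>, so position n can be read directly.
    public static string KeyAt(SortedList<string, int> list, int n)
    {
        return list.Keys[n];
    }

    // SortedDictionary has no positional indexer; LINQ's ElementAt enumerates
    // the keys in order until it reaches position n.
    public static string KeyAt(SortedDictionary<string, int> dict, int n)
    {
        return dict.Keys.ElementAt(n);
    }

    public static void Main()
    {
        var list = new SortedList<string, int> { { "pear", 3 }, { "apple", 1 }, { "fig", 2 } };
        var dict = new SortedDictionary<string, int>(list);

        // Both types enumerate their entries in key order.
        foreach (KeyValuePair<string, int> pair in dict)
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }

        Console.WriteLine(KeyAt(list, 1));          // fig
        Console.WriteLine(list.Values[1]);          // 2
        Console.WriteLine(list.IndexOfKey("pear")); // 2
        Console.WriteLine(KeyAt(dict, 1));          // fig
    }
}
```

So if you only ever foreach over the entries, either type works; the positional lookups are where SortedList is more convenient.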
I don't know why MSDN says that SortedList<TKey, TValue> uses a binary tree for its implementation, because if you look at the code with a decompiler like Reflector you realize it's not true.
SortedList<TKey, TValue> is simply an array that grows over time.
Every time you insert an element, it first checks whether the array has enough capacity; if not, a bigger array is allocated and the old elements are copied into it (like List<T>).
After that, it searches for where to insert the element, using a binary search (this is possible since the array is indexable and already sorted).
To keep the array sorted, it moves (or pushes) all the elements situated after the insertion position by one position (using Array.Copy()).
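The steps described above can be sketched roughly as follows. This is not the actual SortedList source, just an illustration under simplifying assumptions (int keys only, no values, duplicates allowed); the class and the InsertSorted helper are hypothetical names.

```csharp
using System;
using System.Linq;

public static class SortedArrayDemo
{
    // Hypothetical helper: inserts 'key' into the first 'count' sorted slots
    // of 'keys' and returns the new count. Unlike SortedList, it does not
    // reject duplicate keys.
    public static int InsertSorted(ref int[] keys, int count, int key)
    {
        // Grow the backing array when it is full, much like List<T> does.
        if (count == keys.Length)
        {
            Array.Resize(ref keys, Math.Max(4, keys.Length * 2));
        }

        // Binary search over the used portion. A negative result is the
        // bitwise complement of the index where the key belongs.
        int index = Array.BinarySearch(keys, 0, count, key);
        if (index < 0)
        {
            index = ~index;
        }

        // Shift everything from 'index' onward one slot to the right.
        // This is the O(n) step; nothing is copied when index == count.
        Array.Copy(keys, index, keys, index + 1, count - index);
        keys[index] = key;
        return count + 1;
    }

    public static void Main()
    {
        int[] keys = new int[4];
        int count = 0;
        foreach (int k in new[] { 5, 1, 4, 2, 3 })
        {
            count = InsertSorted(ref keys, count, k);
        }
        Console.WriteLine(string.Join(", ", keys.Take(count))); // 1, 2, 3, 4, 5
    }
}
```

Note how inserting 1 after 5 has to move the 5, while appending a key larger than everything else moves nothing.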
That explains why the performance of SortedList is so bad when you insert unsorted elements. It has to re-copy some elements on almost every insertion. The only case where this is not needed is when the element has to be inserted at the end of the array.
SortedDictionary<TKey, TValue> is different and uses a binary tree to insert and retrieve elements. It also has some cost on insert because sometimes the tree needs to be re-balanced (but not on every insertion).
Performance is quite similar when searching for an element with SortedList or SortedDictionary because they both use a binary search.
In my opinion, you should never use SortedList just to sort an array. Unless you have very few elements, it will always be faster to insert the values into a list (or array) and then call the Sort() method.
SortedList is mostly useful when you have a list of values that is already sorted (e.g. from a database), you want to keep it sorted, and you want to perform operations that take advantage of the ordering (e.g. the Contains() method of SortedList performs a binary search instead of a linear search).
SortedDictionary offers the same advantages as SortedList but performs better if the values to insert are not already sorted.
EDIT: If you are using .NET Framework 4.5, an alternative to SortedDictionary<TKey, TValue> is SortedSet<T>. It works the same way as SortedDictionary, using a binary tree, but keys and values are the same here.
An important consideration for us is the fact that we often have small dictionaries (<100 elements), and current processors are much faster at accessing sequential memory while performing few difficult-to-predict branches (i.e. iterating over a linear array rather than traversing a tree). So when you have fewer than about 60 elements in your dictionary, SortedList<> is often the fastest and most memory-efficient dictionary in many use cases.
Are they meant for two different purposes?
There is not much semantic difference between these two collection types in .NET. They both offer keyed lookup and keep the entries sorted by key. In most cases you will be fine with either of them. Perhaps the only differentiator is the indexed retrieval that SortedList permits.
But performance?
However, there is a performance difference, which might be a stronger factor in choosing between them. Here is a tabular view of their asymptotic complexity.
Summary
To roughly summarize, you want a SortedList<K, V> when:
You would instead prefer a SortedDictionary<K, V> when:
Writing code
Both SortedList<K, V> and SortedDictionary<K, V> implement IDictionary<K, V>, so in your code you can return IDictionary<K, V> from the method or declare the variable as IDictionary<K, V>. Basically, hide the implementation detail and code against the interface.
In the future, it's easier to switch from one to the other in case you're not happy with the performance characteristics of one collection.
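A minimal sketch of what coding against the interface could look like. The InterfaceDemo class, the CreateIndex factory and its expectManyRandomInserts flag are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class InterfaceDemo
{
    // Hypothetical factory: callers only ever see IDictionary<string, int>,
    // so the concrete type can be changed here without touching them.
    public static IDictionary<string, int> CreateIndex(bool expectManyRandomInserts)
    {
        if (expectManyRandomInserts)
        {
            return new SortedDictionary<string, int>();
        }
        return new SortedList<string, int>();
    }

    public static void Main()
    {
        IDictionary<string, int> index = CreateIndex(expectManyRandomInserts: true);
        index["b"] = 2;
        index["a"] = 1;

        // Either implementation enumerates its entries in key order.
        foreach (KeyValuePair<string, int> pair in index)
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

The catch is that members specific to one type, such as SortedList's IndexOfKey, are not reachable through IDictionary<K, V>, so this only works if you don't depend on them.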
For more info on the two collection types see the original question linked.
I'm not sure how accurate the MSDN documentation is on SortedList and SortedDictionary. It seems to be saying both are implemented using a binary search tree. But if SortedList uses a binary search tree, why would it be much slower on additions than SortedDictionary?
Anyway, here are some performance test results.
Each test operates on a SortedList / SortedDictionary containing 10,000 int32 keys. Each test is repeated 1,000 times (Release build, Start without Debugging).
The first group of tests adds keys in sequence from 0 to 9,999. The second group of tests adds randomly shuffled keys between 0 and 9,999 (every number is added exactly once).
As with any profiling, the important thing is the relative performance, not the actual numbers.
As you can see, on sorted data the SortedList is faster than the SortedDictionary. On unsorted data the SortedList is slightly quicker on retrieval, but about 9 times slower on adding.
If both are using binary trees internally, it is quite surprising that the Add operation on unsorted data is so much slower for SortedList. It is possible that the sorted list is also adding items to a sorted linear data structure at the same time, which would slow it down.
In that case, however, you would expect the memory usage of a SortedList to be greater than or at least equal to that of a SortedDictionary. But this contradicts what the MSDN documentation says.
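If you want to reproduce the Add part of the tests described above, a harness along these lines might do. This is a sketch, not the code used for the results above: the class and helper names, the fixed seed and the single (unrepeated) run are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class AddBenchmark
{
    // Times how long it takes to Add every key to a freshly created dictionary.
    public static TimeSpan TimeAdds(Func<IDictionary<int, int>> create, int[] keys)
    {
        IDictionary<int, int> target = create();
        Stopwatch watch = Stopwatch.StartNew();
        foreach (int key in keys)
        {
            target.Add(key, key);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // Fisher-Yates shuffle of 0..count-1, so every key appears exactly once.
    public static int[] ShuffledKeys(int count, int seed)
    {
        int[] keys = Enumerable.Range(0, count).ToArray();
        var random = new Random(seed);
        for (int i = keys.Length - 1; i > 0; i--)
        {
            int j = random.Next(i + 1);
            int tmp = keys[i];
            keys[i] = keys[j];
            keys[j] = tmp;
        }
        return keys;
    }

    public static void Main()
    {
        const int count = 10000;
        int[] sorted = Enumerable.Range(0, count).ToArray();
        int[] shuffled = ShuffledKeys(count, 42);

        Console.WriteLine("SortedList, sorted keys:         " + TimeAdds(() => new SortedList<int, int>(), sorted));
        Console.WriteLine("SortedDictionary, sorted keys:   " + TimeAdds(() => new SortedDictionary<int, int>(), sorted));
        Console.WriteLine("SortedList, shuffled keys:       " + TimeAdds(() => new SortedList<int, int>(), shuffled));
        Console.WriteLine("SortedDictionary, shuffled keys: " + TimeAdds(() => new SortedDictionary<int, int>(), shuffled));
    }
}
```

A single run like this is noisy; repeating each measurement many times in a Release build, as described above, gives more trustworthy relative numbers.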