Are there any resources about the asymptotic complexity (big-O and the rest) of methods of .NET collection classes (Dictionary<K,V>
, List<T>
etc...)?
I know that the C5 library's documentation includes some information about it (example), but I'm interested in standard .NET collections too... (and PowerCollections' information would also be nice).
MSDN Lists these:
Dictionary<,>
List<>
SortedList<,>
(edit: wrong link; here's the generic version)
SortedDictionary<,>
etc. For example:
The SortedList(TKey, TValue) generic
class is a binary search tree with
O(log n) retrieval, where n is the
number of elements in the dictionary.
In this, it is similar to the
SortedDictionary(TKey, TValue) generic
class. The two classes have similar
object models, and both have O(log n)
retrieval. Where the two classes
differ is in memory use and speed of
insertion and removal:
SortedList(TKey, TValue) uses less
memory than SortedDictionary(TKey,
TValue).
SortedDictionary(TKey, TValue) has
faster insertion and removal
operations for unsorted data, O(log n)
as opposed to O(n) for
SortedList(TKey, TValue).
If the list is populated all at once
from sorted data, SortedList(TKey,
TValue) is faster than
SortedDictionary(TKey, TValue).
This page summarises some of the time comlplexities for various collection types with Java, though they should be exactly the same for .NET.
I've taken the tables from that page and altered/expanded them for the .NET framework.
See also the MSDN pages for SortedDictionary and SortedList, which detail the time complexities required for various operations.
Searching
Type of Search/Collection Types Complexity Comments
Linear search Array/ArrayList/LinkedList O(N) Unsorted data.
Binary search sorted Array/ArrayList/ O(log N) Requires sorted data.
Search Hashtable/Dictionary<T> O(1) Uses hash function.
Binary search SortedDictionary/SortedKey O(log N) Sorting is automated.
Retrieval and Insertion
Operation Array/ArrayList LinkedList SortedDictionary SortedList
Access back O(1) O(1) O(log N) O(log N)
Access front O(1) O(1) N.A. N.A.
Access middle O(1) O(N) N.A. N.A.
Insert at back O(1) O(1) O(log N) O(N)
Insert at front O(N) O(1) N.A. N.A.
Insert in middle O(N) O(1) N.A. N.A.
Deletion should have the same complexity as insertion for the associated collection.
SortedList has a few notable peculiarities for insertion and retrieval.
Insertion (Add method):
This method is an O(n) operation for
unsorted data, where n is Count. It is
an O(log n) operation if the new
element is added at the end of the
list. If insertion causes a resize,
the operation is O(n).
Retrieval (Item property):
Retrieving the value of this property
is an O(log n) operation, where n is
Count. Setting the property is an
O(log n) operation if the key is
already in the SortedList<(Of <(TKey,
TValue>)>). If the key is not in the
list, setting the property is an O(n)
operation for unsorted data, or O(log
n) if the new element is added at the
end of the list. If insertion causes a
resize, the operation is O(n).
Note that ArrayList
is equivalent to List<T>
in terms of the complexity of all operations.
I don't know in general (the other answer just posted perhaps gives you exactly what you're after) - but you can reflect this and other methods of course using ILSpy (a little awkward with FSharp code, true) and this eventually yields this function as C#:
internal static a maximumElementAux<a>(SetTree<a> s, a n)
{
while (true)
{
SetTree<a> setTree = s;
if (setTree is SetTree<a>.SetOne)
{
break;
}
if (setTree == null)
{
return n;
}
SetTree<a>.SetNode setNode = (SetTree<a>.SetNode)s;
SetTree<a> arg_23_0 = setNode.item3;
n = setNode.item1;
s = arg_23_0;
}
return ((SetTree<a>.SetOne)s).item;
return n;
}
Okay so this is not exactly 'proper' code in C# terms - but the presence of the while(true)
loop implies it can't be O(1) at least; as for what it actually is... well, my head hurts too much to find out :)
This page presents short notes about some key pros & cons for most .NET Collections :
http://geekswithblogs.net/BlackRabbitCoder/archive/2011/06/16/c.net-fundamentals-choosing-the-right-collection-class.aspx
There's no such thing as "complexity of collection classes". Rather, different operations on these collections have different complexities. For instance, adding an element to a Dictionary<K, V>
...
...approaches an O(1) operation. If the capacity must be increased to accommodate the new element, this method becomes an O(n) operation, where n
is Count
.
Whereas retrieving an element from a Dictionary<K, V>
...
...approaches an O(1) operation.
The documentation says it is build on a binary tree, and does not mention tracking the maximum element. If the documentation is correct, that means it should be O( log n). There used to be at least one mistake in the collections documentation (referring to an array-backed data structure as a binary search tree), but that has been corrected.