Can anyone explain what is the complexity of the following Dictionary
methods?
ContainsKey(key)
Add(key,value);
I'm trying to figure out the complexity of a method I wrote:
public void DistinctWords(String s)
{
Dictionary<string,string> d = new Dictionary<string,string>();
String[] splitted = s.split(" ");
foreach ( String ss in splitted)
{
if (!d.containskey(ss))
d.add(ss,null);
}
}
I assumed that the 2 dictionary methods are of log(n) complexity where n is the number of keys in the dictionary. Is this correct?
This routine, as a whole, is, effectively, O(m) time complexity, with m being the number of strings in your search.
This is because Dictionary.Contains and Dictionary.Add are both (normally) O(1) operations.
(It's slightly more complicated than that, as Dictionary.Add can be O(n) for n items in the Dictionary, but only when the dictionary capacity is small. As such, if you construct your dictionary with a large enough capacity up front, it would be O(m) for m string entries.)
That being said, if you're only using the Dictionary for existence checking, you could use a HashSet<string>
. This would allow you to write:
public void DistinctWords(String s)
{
HashSet<string> hash = new HashSet<string>(s.Split(' '));
// Use hash here...
As your dictionary is a local variable, and not stored (at least in your code), you could also use LINQ:
var distinctWords = s.Split(' ').Distinct();
It's written in the documentation for Dictionary...
The Dictionary generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table.
And for the Add function:
If Count is less than the capacity, this method approaches an O(1) operation. If the capacity must be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count.
Both methods have constant complexity:
- ContainsKey(key) - O(1)
- Add(key,value) - O(1)
That is not correct, generally dictionaries / hashtable lookup is O(1). To do this it will generate a hash from the key your are looking for and only compare it to items that have the same hash - with a good hashing algorithm this is considered O(1) overall (amortized O(1) - only in the rare occasion that the capacity must be increased for an addition you have O(n)).
Both are constant time:
http://msdn.microsoft.com/en-us/library/kw5aaea4.aspx
http://msdn.microsoft.com/en-us/library/k7z0zy8k.aspx
One caveat however:
"If Count is less than the capacity, this method approaches an O(1) operation. If the capacity must be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count."
The ContainsKey and Add method are close to O(1).
ContainsKey documentation:
This method approaches an O(1) operation.
Add documentation:
If Count is less than the capacity, this method approaches an O(1)
operation. If the capacity must be increased to accommodate the new
element, this method becomes an O(n) operation, where n is Count.
If you are using Framework 3.5 or later, you can use a HashSet<T>
instead of a dictionary with dummy values:
public void DistinctWords(String s) {
HashSet<string> d = new HashSet<string(s.split(" "));
}