What's the best way to merge 2 or more dictionaries (Dictionary<T1,T2>
) in C#?
(3.0 features like LINQ are fine).
I'm thinking of a method signature along the lines of:
public static Dictionary<TKey,TValue>
Merge<TKey,TValue>(Dictionary<TKey,TValue>[] dictionaries);
or
public static Dictionary<TKey,TValue>
Merge<TKey,TValue>(IEnumerable<Dictionary<TKey,TValue>> dictionaries);
EDIT: Got a cool solution from JaredPar and Jon Skeet, but I was thinking of something that handles duplicate keys. In case of collision, it doesn't matter which value is saved to the dict as long as it's consistent.
Considering the performance of dictionary key lookups and deletes since they are hash operations, and considering the wording of the question was best way, I think that below is a perfectly valid approach, and the others are a bit over-complicated, IMHO.
OR if you're working in a multithreaded application and your dictionary needs to be thread safe anyway, you should be doing this:
You could then wrap this to make it handle an enumeration of dictionaries. Regardless, you're looking at about ~O(3n) (all conditions being perfect), since the
.Add()
will do an additional, unnecessary but practically free,Contains()
behind the scenes. I don't think it gets much better.If you wanted to limit extra operations on large collections, you should sum up the
Count
of each dictionary you're about to merge and set the capacity of the the target dictionary to that, which avoids the later cost of resizing. So, end product is something like this...Note that I took in an
IList<T>
to this method... mostly because if you take in anIEnumerable<T>
, you've opened yourself up to multiple enumerations of the same set, which can be very costly if you got your collection of dictionaries from a deferred LINQ statement.This partly depends on what you want to happen if you run into duplicates. For instance, you could do:
That will blow up if you get any duplicate keys.
EDIT: If you use ToLookup then you'll get a lookup which can have multiple values per key. You could then convert that to a dictionary:
It's a bit ugly - and inefficient - but it's the quickest way to do it in terms of code. (I haven't tested it, admittedly.)
You could write your own ToDictionary2 extension method of course (with a better name, but I don't have time to think of one now) - it's not terribly hard to do, just overwriting (or ignoring) duplicate keys. The important bit (to my mind) is using SelectMany, and realising that a dictionary supports iteration over its key/value pairs.
The following works for me. If there are duplicates, it will use dictA's value.
Merging using an
EqualityComparer
that maps items for comparison to a different value/type. Here we will map fromKeyValuePair
(item type when enumerating a dictionary) toKey
.Usage:
I would do it like this:
Simple and easy. According to this blog post it's even faster than most loops as its underlying implementation accesses elements by index rather than enumerator (see this answer).
It will of course throw an exception if there are duplicates, so you'll have to check before merging.