Merging dictionaries in C#

2018-12-31 14:12发布

What's the best way to merge 2 or more dictionaries (Dictionary<T1,T2>) in C#? (3.0 features like LINQ are fine).

I'm thinking of a method signature along the lines of:

public static Dictionary<TKey,TValue>
                 Merge<TKey,TValue>(Dictionary<TKey,TValue>[] dictionaries);

or

public static Dictionary<TKey,TValue>
                 Merge<TKey,TValue>(IEnumerable<Dictionary<TKey,TValue>> dictionaries);

EDIT: Got a cool solution from JaredPar and Jon Skeet, but I was thinking of something that handles duplicate keys. In case of collision, it doesn't matter which value is saved to the dict as long as it's consistent.

20条回答
看风景的人
2楼-- · 2018-12-31 14:51

Considering the performance of dictionary key lookups and deletes since they are hash operations, and considering the wording of the question was best way, I think that below is a perfectly valid approach, and the others are a bit over-complicated, IMHO.

    public static void MergeOverwrite<T1, T2>(this IDictionary<T1, T2> dictionary, IDictionary<T1, T2> newElements)
    {
        if (newElements == null) return;

        foreach (var e in newElements)
        {
            dictionary.Remove(e.Key); //or if you don't want to overwrite do (if !.Contains()
            dictionary.Add(e);
        }
    }

OR if you're working in a multithreaded application and your dictionary needs to be thread safe anyway, you should be doing this:

    public static void MergeOverwrite<T1, T2>(this ConcurrentDictionary<T1, T2> dictionary, IDictionary<T1, T2> newElements)
    {
        if (newElements == null || newElements.Count == 0) return;

        foreach (var ne in newElements)
        {
            dictionary.AddOrUpdate(ne.Key, ne.Value, (key, value) => value);
        }
    }

You could then wrap this to make it handle an enumeration of dictionaries. Regardless, you're looking at about ~O(3n) (all conditions being perfect), since the .Add() will do an additional, unnecessary but practically free, Contains() behind the scenes. I don't think it gets much better.

If you wanted to limit extra operations on large collections, you should sum up the Count of each dictionary you're about to merge and set the capacity of the the target dictionary to that, which avoids the later cost of resizing. So, end product is something like this...

    public static IDictionary<T1, T2> MergeAllOverwrite<T1, T2>(IList<IDictionary<T1, T2>> allDictionaries)
    {
        var initSize = allDictionaries.Sum(d => d.Count);
        var resultDictionary = new Dictionary<T1, T2>(initSize);
        allDictionaries.ForEach(resultDictionary.MergeOverwrite);
        return resultDictionary;
    }

Note that I took in an IList<T> to this method... mostly because if you take in an IEnumerable<T>, you've opened yourself up to multiple enumerations of the same set, which can be very costly if you got your collection of dictionaries from a deferred LINQ statement.

查看更多
孤独寂梦人
3楼-- · 2018-12-31 14:52

This partly depends on what you want to happen if you run into duplicates. For instance, you could do:

var result = dictionaries.SelectMany(dict => dict)
                         .ToDictionary(pair => pair.Key, pair => pair.Value);

That will blow up if you get any duplicate keys.

EDIT: If you use ToLookup then you'll get a lookup which can have multiple values per key. You could then convert that to a dictionary:

var result = dictionaries.SelectMany(dict => dict)
                         .ToLookup(pair => pair.Key, pair => pair.Value)
                         .ToDictionary(group => group.Key, group => group.First());

It's a bit ugly - and inefficient - but it's the quickest way to do it in terms of code. (I haven't tested it, admittedly.)

You could write your own ToDictionary2 extension method of course (with a better name, but I don't have time to think of one now) - it's not terribly hard to do, just overwriting (or ignoring) duplicate keys. The important bit (to my mind) is using SelectMany, and realising that a dictionary supports iteration over its key/value pairs.

查看更多
人气声优
4楼-- · 2018-12-31 14:53

The following works for me. If there are duplicates, it will use dictA's value.

public static IDictionary<TKey, TValue> Merge<TKey, TValue>(this IDictionary<TKey, TValue> dictA, IDictionary<TKey, TValue> dictB)
    where TValue : class
{
    return dictA.Keys.Union(dictB.Keys).ToDictionary(k => k, k => dictA.ContainsKey(k) ? dictA[k] : dictB[k]);
}
查看更多
只若初见
5楼-- · 2018-12-31 14:57
Dictionary<String, String> allTables = new Dictionary<String, String>();
allTables = tables1.Union(tables2).ToDictionary(pair => pair.Key, pair => pair.Value);
查看更多
忆尘夕之涩
6楼-- · 2018-12-31 14:57

Merging using an EqualityComparer that maps items for comparison to a different value/type. Here we will map from KeyValuePair (item type when enumerating a dictionary) to Key.

public class MappedEqualityComparer<T,U> : EqualityComparer<T>
{
    Func<T,U> _map;

    public MappedEqualityComparer(Func<T,U> map)
    {
        _map = map;
    }

    public override bool Equals(T x, T y)
    {
        return EqualityComparer<U>.Default.Equals(_map(x), _map(y));
    }

    public override int GetHashCode(T obj)
    {
        return _map(obj).GetHashCode();
    }
}

Usage:

// if dictA and dictB are of type Dictionary<int,string>
var dict = dictA.Concat(dictB)
                .Distinct(new MappedEqualityComparer<KeyValuePair<int,string>,int>(item => item.Key))
                .ToDictionary(item => item.Key, item=> item.Value);
查看更多
笑指拈花
7楼-- · 2018-12-31 14:58

I would do it like this:

dictionaryFrom.ToList().ForEach(x => dictionaryTo.Add(x.Key, x.Value));

Simple and easy. According to this blog post it's even faster than most loops as its underlying implementation accesses elements by index rather than enumerator (see this answer).

It will of course throw an exception if there are duplicates, so you'll have to check before merging.

查看更多
登录 后发表回答