Merging dictionaries in C#

2018-12-31 13:59发布

问题:

What\'s the best way to merge 2 or more dictionaries (Dictionary<T1,T2>) in C#? (3.0 features like LINQ are fine).

I\'m thinking of a method signature along the lines of:

public static Dictionary<TKey,TValue>
                 Merge<TKey,TValue>(Dictionary<TKey,TValue>[] dictionaries);

or

public static Dictionary<TKey,TValue>
                 Merge<TKey,TValue>(IEnumerable<Dictionary<TKey,TValue>> dictionaries);

EDIT: Got a cool solution from JaredPar and Jon Skeet, but I was thinking of something that handles duplicate keys. In case of collision, it doesn\'t matter which value is saved to the dict as long as it\'s consistent.

回答1:

This partly depends on what you want to happen if you run into duplicates. For instance, you could do:

var result = dictionaries.SelectMany(dict => dict)
                         .ToDictionary(pair => pair.Key, pair => pair.Value);

That will blow up if you get any duplicate keys.

EDIT: If you use ToLookup then you\'ll get a lookup which can have multiple values per key. You could then convert that to a dictionary:

var result = dictionaries.SelectMany(dict => dict)
                         .ToLookup(pair => pair.Key, pair => pair.Value)
                         .ToDictionary(group => group.Key, group => group.First());

It\'s a bit ugly - and inefficient - but it\'s the quickest way to do it in terms of code. (I haven\'t tested it, admittedly.)

You could write your own ToDictionary2 extension method of course (with a better name, but I don\'t have time to think of one now) - it\'s not terribly hard to do, just overwriting (or ignoring) duplicate keys. The important bit (to my mind) is using SelectMany, and realising that a dictionary supports iteration over its key/value pairs.



回答2:

I would do it like this:

dictionaryFrom.ToList().ForEach(x => dictionaryTo.Add(x.Key, x.Value));

Simple and easy. According to this blog post it\'s even faster than most loops as its underlying implementation accesses elements by index rather than enumerator (see this answer).

It will of course throw an exception if there are duplicates, so you\'ll have to check before merging.



回答3:

Well, I\'m late to the party, but here is what I use. It doesn\'t explode if there are multiple keys (\"righter\" keys replace \"lefter\" keys), can merge a number of dictionaries (if desired) and preserves the type (with the restriction that it requires a meaningful default public constructor):

public static class DictionaryExtensions
{
    // Works in C#3/VS2008:
    // Returns a new dictionary of this ... others merged leftward.
    // Keeps the type of \'this\', which must be default-instantiable.
    // Example: 
    //   result = map.MergeLeft(other1, other2, ...)
    public static T MergeLeft<T,K,V>(this T me, params IDictionary<K,V>[] others)
        where T : IDictionary<K,V>, new()
    {
        T newMap = new T();
        foreach (IDictionary<K,V> src in
            (new List<IDictionary<K,V>> { me }).Concat(others)) {
            // ^-- echk. Not quite there type-system.
            foreach (KeyValuePair<K,V> p in src) {
                newMap[p.Key] = p.Value;
            }
        }
        return newMap;
    }

}


回答4:

The trivial solution would be:

using System.Collections.Generic;
...
public static Dictionary<TKey, TValue>
    Merge<TKey,TValue>(IEnumerable<Dictionary<TKey, TValue>> dictionaries)
{
    var result = new Dictionary<TKey, TValue>();
    foreach (var dict in dictionaries)
        foreach (var x in dict)
            result[x.Key] = x.Value;
    return result;
}


回答5:

Try the following

static Dictionary<TKey, TValue>
    Merge<TKey, TValue>(this IEnumerable<Dictionary<TKey, TValue>> enumerable)
{
    return enumerable.SelectMany(x => x).ToDictionary(x => x.Key, y => y.Value);
}


回答6:

Dictionary<String, String> allTables = new Dictionary<String, String>();
allTables = tables1.Union(tables2).ToDictionary(pair => pair.Key, pair => pair.Value);


回答7:

The following works for me. If there are duplicates, it will use dictA\'s value.

public static IDictionary<TKey, TValue> Merge<TKey, TValue>(this IDictionary<TKey, TValue> dictA, IDictionary<TKey, TValue> dictB)
    where TValue : class
{
    return dictA.Keys.Union(dictB.Keys).ToDictionary(k => k, k => dictA.ContainsKey(k) ? dictA[k] : dictB[k]);
}


回答8:

Here is a helper function I use:

using System.Collections.Generic;
namespace HelperMethods
{
    public static class MergeDictionaries
    {
        public static void Merge<TKey, TValue>(this IDictionary<TKey, TValue> first, IDictionary<TKey, TValue> second)
        {
            if (second == null || first == null) return;
            foreach (var item in second) 
                if (!first.ContainsKey(item.Key)) 
                    first.Add(item.Key, item.Value);
        }
    }
}


回答9:

How about adding a params overload?

Also, you should type them as IDictionary for maximum flexibility.

public static IDictionary<TKey, TValue> Merge<TKey, TValue>(IEnumerable<IDictionary<TKey, TValue>> dictionaries)
{
    // ...
}

public static IDictionary<TKey, TValue> Merge<TKey, TValue>(params IDictionary<TKey, TValue>[] dictionaries)
{
    return Merge((IEnumerable<TKey, TValue>) dictionaries);
}


回答10:

I\'m very late to the party and perhaps missing something, but if either there are no duplicate keys or, as the OP says, \"In case of collision, it doesn\'t matter which value is saved to the dict as long as it\'s consistent,\" what\'s wrong with this one (merging D2 into D1)?

foreach (KeyValuePair<string,int> item in D2)
            {
                 D1[item.Key] = item.Value;
            }

It seems simple enough, maybe too simple, I wonder if I\'m missing something. This is what I\'m using in some code where I know there are no duplicate keys. I\'m still in testing, though, so I\'d love to know now if I\'m overlooking something, instead of finding out later.



回答11:

Considering the performance of dictionary key lookups and deletes since they are hash operations, and considering the wording of the question was best way, I think that below is a perfectly valid approach, and the others are a bit over-complicated, IMHO.

    public static void MergeOverwrite<T1, T2>(this IDictionary<T1, T2> dictionary, IDictionary<T1, T2> newElements)
    {
        if (newElements == null) return;

        foreach (var e in newElements)
        {
            dictionary.Remove(e.Key); //or if you don\'t want to overwrite do (if !.Contains()
            dictionary.Add(e);
        }
    }

OR if you\'re working in a multithreaded application and your dictionary needs to be thread safe anyway, you should be doing this:

    public static void MergeOverwrite<T1, T2>(this ConcurrentDictionary<T1, T2> dictionary, IDictionary<T1, T2> newElements)
    {
        if (newElements == null || newElements.Count == 0) return;

        foreach (var ne in newElements)
        {
            dictionary.AddOrUpdate(ne.Key, ne.Value, (key, value) => value);
        }
    }

You could then wrap this to make it handle an enumeration of dictionaries. Regardless, you\'re looking at about ~O(3n) (all conditions being perfect), since the .Add() will do an additional, unnecessary but practically free, Contains() behind the scenes. I don\'t think it gets much better.

If you wanted to limit extra operations on large collections, you should sum up the Count of each dictionary you\'re about to merge and set the capacity of the the target dictionary to that, which avoids the later cost of resizing. So, end product is something like this...

    public static IDictionary<T1, T2> MergeAllOverwrite<T1, T2>(IList<IDictionary<T1, T2>> allDictionaries)
    {
        var initSize = allDictionaries.Sum(d => d.Count);
        var resultDictionary = new Dictionary<T1, T2>(initSize);
        allDictionaries.ForEach(resultDictionary.MergeOverwrite);
        return resultDictionary;
    }

Note that I took in an IList<T> to this method... mostly because if you take in an IEnumerable<T>, you\'ve opened yourself up to multiple enumerations of the same set, which can be very costly if you got your collection of dictionaries from a deferred LINQ statement.



回答12:

Based on the answers above, but adding a Func-parameter to let the caller handle the duplicates:

public static Dictionary<TKey, TValue> Merge<TKey, TValue>(this IEnumerable<Dictionary<TKey, TValue>> dicts, 
                                                           Func<IGrouping<TKey, TValue>, TValue> resolveDuplicates)
{
    if (resolveDuplicates == null)
        resolveDuplicates = new Func<IGrouping<TKey, TValue>, TValue>(group => group.First());

    return dicts.SelectMany<Dictionary<TKey, TValue>, KeyValuePair<TKey, TValue>>(dict => dict)
                .ToLookup(pair => pair.Key, pair => pair.Value)
                .ToDictionary(group => group.Key, group => resolveDuplicates(group));
}


回答13:

The party\'s pretty much dead now, but here\'s an \"improved\" version of user166390 that made its way into my extension library. Apart from some details, I added a delegate to calculate the merged value.

/// <summary>
/// Merges a dictionary against an array of other dictionaries.
/// </summary>
/// <typeparam name=\"TResult\">The type of the resulting dictionary.</typeparam>
/// <typeparam name=\"TKey\">The type of the key in the resulting dictionary.</typeparam>
/// <typeparam name=\"TValue\">The type of the value in the resulting dictionary.</typeparam>
/// <param name=\"source\">The source dictionary.</param>
/// <param name=\"mergeBehavior\">A delegate returning the merged value. (Parameters in order: The current key, The current value, The previous value)</param>
/// <param name=\"mergers\">Dictionaries to merge against.</param>
/// <returns>The merged dictionary.</returns>
public static TResult MergeLeft<TResult, TKey, TValue>(
    this TResult source,
    Func<TKey, TValue, TValue, TValue> mergeBehavior,
    params IDictionary<TKey, TValue>[] mergers)
    where TResult : IDictionary<TKey, TValue>, new()
{
    var result = new TResult();
    var sources = new List<IDictionary<TKey, TValue>> { source }
        .Concat(mergers);

    foreach (var kv in sources.SelectMany(src => src))
    {
        TValue previousValue;
        result.TryGetValue(kv.Key, out previousValue);
        result[kv.Key] = mergeBehavior(kv.Key, kv.Value, previousValue);
    }

    return result;
}


回答14:

@Tim: Should be a comment, but comments don\'t allow for code editing.

Dictionary<string, string> t1 = new Dictionary<string, string>();
t1.Add(\"a\", \"aaa\");
Dictionary<string, string> t2 = new Dictionary<string, string>();
t2.Add(\"b\", \"bee\");
Dictionary<string, string> t3 = new Dictionary<string, string>();
t3.Add(\"c\", \"cee\");
t3.Add(\"d\", \"dee\");
t3.Add(\"b\", \"bee\");
Dictionary<string, string> merged = t1.MergeLeft(t2, t2, t3);

Note: I applied the modification by @ANeves to the solution by @Andrew Orsich, so the MergeLeft looks like this now:

public static Dictionary<K, V> MergeLeft<K, V>(this Dictionary<K, V> me, params IDictionary<K, V>[] others)
    {
        var newMap = new Dictionary<K, V>(me, me.Comparer);
        foreach (IDictionary<K, V> src in
            (new List<IDictionary<K, V>> { me }).Concat(others))
        {
            // ^-- echk. Not quite there type-system.
            foreach (KeyValuePair<K, V> p in src)
            {
                newMap[p.Key] = p.Value;
            }
        }
        return newMap;
    }


回答15:

I know this is an old question, but since we now have LINQ you can do it in a single line like this

Dictionary<T1,T2> merged;
Dictionary<T1,T2> mergee;
mergee.ToList().ForEach(kvp => merged.Add(kvp.Key, kvp.Value));

or

mergee.ToList().ForEach(kvp => merged.Append(kvp));


回答16:

Merging using an extension method. It does not throw exception when there are duplicate keys, but replaces those keys with keys from the second dictionary.

internal static class DictionaryExtensions
{
    public static Dictionary<T1, T2> Merge<T1, T2>(this Dictionary<T1, T2> first, Dictionary<T1, T2> second)
    {
        if (first == null) throw new ArgumentNullException(\"first\");
        if (second == null) throw new ArgumentNullException(\"second\");

        var merged = new Dictionary<T1, T2>();
        first.ToList().ForEach(kv => merged[kv.Key] = kv.Value);
        second.ToList().ForEach(kv => merged[kv.Key] = kv.Value);

        return merged;
    }
}

Usage:

Dictionary<string, string> merged = first.Merge(second);


回答17:

Merging using an EqualityComparer that maps items for comparison to a different value/type. Here we will map from KeyValuePair (item type when enumerating a dictionary) to Key.

public class MappedEqualityComparer<T,U> : EqualityComparer<T>
{
    Func<T,U> _map;

    public MappedEqualityComparer(Func<T,U> map)
    {
        _map = map;
    }

    public override bool Equals(T x, T y)
    {
        return EqualityComparer<U>.Default.Equals(_map(x), _map(y));
    }

    public override int GetHashCode(T obj)
    {
        return _map(obj).GetHashCode();
    }
}

Usage:

// if dictA and dictB are of type Dictionary<int,string>
var dict = dictA.Concat(dictB)
                .Distinct(new MappedEqualityComparer<KeyValuePair<int,string>,int>(item => item.Key))
                .ToDictionary(item => item.Key, item=> item.Value);


回答18:

or :

public static IDictionary<TKey, TValue> Merge<TKey, TValue>( IDictionary<TKey, TValue> x, IDictionary<TKey, TValue> y)
    {
        return x
            .Except(x.Join(y, z => z.Key, z => z.Key, (a, b) => a))
            .Concat(y)
            .ToDictionary(z => z.Key, z => z.Value);
    }

the result is a union where for duplicate entries \"y\" wins.



回答19:

Got scared to see complex answers, being new to C#.

Here are some simple answers.
Merging d1, d2, and so on.. dictionaries and handle any overlapping keys (\"b\" in below examples):

Example 1

{
    // 2 dictionaries,  \"b\" key is common with different values

    var d1 = new Dictionary<string, int>() { { \"a\", 10 }, { \"b\", 21 } };
    var d2 = new Dictionary<string, int>() { { \"c\", 30 }, { \"b\", 22 } };

    var result1 = d1.Concat(d2).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.First().Value);
    // result1 is  a=10, b=21, c=30    That is, took the \"b\" value of the first dictionary

    var result2 = d1.Concat(d2).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.Last().Value);
    // result2 is  a=10, b=22, c=30    That is, took the \"b\" value of the last dictionary
}

Example 2

{
    // 3 dictionaries,  \"b\" key is common with different values

    var d1 = new Dictionary<string, int>() { { \"a\", 10 }, { \"b\", 21 } };
    var d2 = new Dictionary<string, int>() { { \"c\", 30 }, { \"b\", 22 } };
    var d3 = new Dictionary<string, int>() { { \"d\", 40 }, { \"b\", 23 } };

    var result1 = d1.Concat(d2).Concat(d3).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.First().Value);
    // result1 is  a=10, b=21, c=30, d=40    That is, took the \"b\" value of the first dictionary

    var result2 = d1.Concat(d2).Concat(d3).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.Last().Value);
    // result2 is  a=10, b=23, c=30, d=40    That is, took the \"b\" value of the last dictionary
}

For more complex scenarios, see other answers.
Hope that helped.



回答20:

using System.Collections.Generic;
using System.Linq;

public static class DictionaryExtensions
{
    public enum MergeKind { SkipDuplicates, OverwriteDuplicates }
    public static void Merge<K, V>(this IDictionary<K, V> target, IDictionary<K, V> source, MergeKind kind = MergeKind.SkipDuplicates) =>
        source.ToList().ForEach(_ => { if (kind == MergeKind.OverwriteDuplicates || !target.ContainsKey(_.Key)) target[_.Key] = _.Value; });
}

You can either skip/ignore (default) or overwrite the duplicates: And Bob\'s your uncle provided you are not overly fussy about Linq performance but prefer instead concise maintainable code as I do: in which case you can remove the default MergeKind.SkipDuplicates to enforce a choice for the caller and make the developer cognisant of what the results will be!