Python remove duplicate value in a combined dictio

Posted 2019-02-25 06:06

Question:

I need a little bit of homework help. I have to write a function that combines several dictionaries into a new dictionary. If a key appears more than once, the value for that key in the new dictionary should be a list of the unique values. This is what I have so far:

f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g =  {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}

def merge(*d):
    newdicts={}
    for dict in d:
        for k in dict.items():
            if k[0] in newdicts:
                newdicts[k[0]].append(k[1])
            else:
                newdicts[k[0]]=[k[1]]
    return newdicts

combined = merge(f, g, h, r)
print(combined)

The output looks like:

{'a': ['apple', 'adam'], 'c': ['cat', 'car'], 'b': ['bat', 'bat', 'boy'], 'e': ['elephant'], 'd': ['dog', 'deer']}

Under the 'b' key, 'bat' appears twice. How do I remove the duplicates?

I've looked at filter and lambda, but I couldn't figure out how to use them here (maybe because it's a list inside a dictionary?).

Any help would be appreciated. And thank you in advance for all your help!

Answer 1:

Just test whether the element is already in the list before adding it:

for k in dict.items():
    if k[0] in newdicts:
        if k[1] not in newdicts[k[0]]:  # Do this test before adding.
            newdicts[k[0]].append(k[1])
    else:
        newdicts[k[0]]=[k[1]]

Since you want only unique elements in each value list, you can use a set as the value instead. You can also use a defaultdict, so that you don't have to test whether the key exists before adding.

Also, don't use built-in names for your variables. Instead of dict, pick some other name, such as each_dict.

So, you can modify your merge method as:

from collections import defaultdict

def merge(*d):
    newdicts = defaultdict(set)  # Define a defaultdict
    for each_dict in d:

        # items() yields (key, value) pairs (a list in Python 2, a view in Python 3).
        # So, you can unpack each pair directly into two loop variables.
        for k, v in each_dict.items():
            newdicts[k].add(v)

    # If you want lists as values, as in the output you have shown,
    # build a normal dict from the defaultdict (order within each list is arbitrary).
    unique = {key: list(value) for key, value in newdicts.items()}
    return unique
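
One caveat with the set-based version: sets do not keep insertion order, so the values in each list can come out in any order. If the order of first appearance matters, here is a minimal sketch of one way to keep it. It assumes Python 3.7+, where plain dicts preserve insertion order, and the name merge_ordered is made up for illustration:

```python
from collections import defaultdict

def merge_ordered(*dicts):
    # Each inner dict acts as an ordered set: its keys are the unique values.
    seen = defaultdict(dict)
    for each_dict in dicts:
        for k, v in each_dict.items():
            # Assigning an existing key again keeps its original position.
            seen[k][v] = None
    # Turn the inner dicts back into lists of their keys.
    return {key: list(values) for key, values in seen.items()}

f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}

combined = merge_ordered(f, g, h, r)
print(combined['b'])  # ['bat', 'boy']
```

As with the set version, the values have to be hashable for this to work.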


Answer 2:

>>> import collections
>>> import itertools
>>> uniques = collections.defaultdict(set)
>>> for k, v in itertools.chain(f.items(), g.items(), h.items(), r.items()):
...   uniques[k].add(v)
... 
>>> uniques
defaultdict(<type 'set'>, {'a': set(['apple', 'adam']), 'c': set(['car', 'cat']), 'b': set(['boy', 'bat']), 'e': set(['elephant']), 'd': set(['deer', 'dog'])})

Note that the values are sets, not lists. A set checks for duplicates with a hash lookup (constant time on average) instead of scanning a list, so this is more efficient as the number of values grows. If you would like the final values to be lists, you can do the following:

>>> {x: list(y) for x, y in uniques.items()}
{'a': ['apple', 'adam'], 'c': ['car', 'cat'], 'b': ['boy', 'bat'], 'e': ['elephant'], 'd': ['deer', 'dog']}
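
Because a set has no defined order, the lists built with list(y) can come out in a different order than shown above. If you need reproducible output, one option is to sort each set while converting it. The following is a rough sketch of that idea as a standalone script; the name as_sorted_lists is only illustrative:

```python
import collections
import itertools

f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}

# Collect the unique values per key, exactly as in the session above.
uniques = collections.defaultdict(set)
for k, v in itertools.chain(f.items(), g.items(), h.items(), r.items()):
    uniques[k].add(v)

# sorted() accepts any iterable and returns a new list in ascending order.
as_sorted_lists = {k: sorted(v) for k, v in uniques.items()}
print(as_sorted_lists['b'])  # ['bat', 'boy']
```

Sorting only works if the values for a key can be compared with each other, which is the case for the strings used here.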



Answer 3:

Add this check inside your for loop:

for dict in d:
    for k in dict.items():
        if k[0] in newdicts:
            # This line below
            if k[1] not in newdicts[k[0]]:
                newdicts[k[0]].append(k[1])
        else:
            newdicts[k[0]]=[k[1]]

This makes sure duplicates aren't added.
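
For reference, the check above could be put into a complete function roughly like this. It is only a sketch: the variable names are changed so that the built-in dict is not shadowed, and the sample data is copied from the question:

```python
def merge(*dicts):
    merged = {}
    for each_dict in dicts:
        for key, value in each_dict.items():
            if key in merged:
                # Only append values that are not already in the list.
                if value not in merged[key]:
                    merged[key].append(value)
            else:
                merged[key] = [value]
    return merged

f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}

combined = merge(f, g, h, r)
print(combined['b'])  # ['bat', 'boy']
```

The list membership test scans the whole list each time, so it is slower than a set when a key collects many values. On the other hand, it keeps the values in the order they were first seen, and it also works for values that are not hashable, such as lists.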



Answer 4:

Use a set when you want unique elements:

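def merge_dicts(*dicts):
    result = {}
    for each_dict in dicts:  # avoid shadowing the built-in name dict
        for key, value in each_dict.items():
            # setdefault returns the existing set, or stores and returns a new one.
            result.setdefault(key, set()).add(value)
    return result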

Try to avoid using indices; unpack tuples instead.