Merging sets with common elements? [duplicate]

2019-05-22 22:47发布

This question already has an answer here:

I would like to merge sets with common elements. For example

input = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]),  frozenset([1,1000]),
             frozenset([100, 200]), frozenset([100, 300, 400])])

Result:

set([frozenset([1,2,3,4,5,6,7,8, 1000]), frozenset([100,200,300,400])])

What would be an efficient way to achieve this?

标签: python set
2条回答
Emotional °昔
2楼-- · 2019-05-22 23:13

Here are what I tried. Firstly, I did something simple but wrong

result = set()
while input:                                                                   
    r = set(input.pop())
    for rr in input.copy():
        if rr & r:                                                  
            input.remove(rr)
            r.update(rr)  
    result.add(frozenset(r))   

The problem with this code is that the merging may not be complete, depending on the order of the input.pop(). Suppose input={frozenset([0, 1]), frozenset([2,3]), frozenset([1,2])} and the three elements get popped out and looped over in this assignment order, then results={frozenset([0,1,2]), frozenset([2,3])}.

Then I implemented the answer in this post. It first builds an adjacency list of a new graph, each node corresponds to one frozenset element of the input. An edge exists between two nodes (frozensets) if they share common elements. Then depth first graph traversal is used to find the connected components of this new graph.

result, visited = set(), set()
components = collections.defaultdict(list)
adj_list = collections.defaultdict(list)

def dft(node, key):
    visited.add(node)
    components[key].append(node)
    for neighbor in adj_list[node]:
        if neighbor not in visited:
            dft(neighbor, key)

for r1, r2 in itertools.combinations_with_replacement(input, 2):
    if r1 & r2:
        adj_list[r1].append(r2)
        adj_list[r2].append(r1)
for node in adj_list:
    if node not in visited:
        dft(node, node)

for node, neighbors in components.iteritems():
    result.add(node.union(*neighbors))
查看更多
干净又极端
3楼-- · 2019-05-22 23:17

Use the built-in set.union() -> only include results where the length of set.intersection() is greater than 0.


This answer explains at a high level how to implement what you want. If you want a specific solution, you'll have to show your efforts to solve the problem first.

查看更多
登录 后发表回答