I tried a few selected answers, but they don't seem to work. I have to merge a list of lists.
[[283, 311], [311, 283, 316], [150, 68], [86, 119], [259, 263], [263, 259, 267], [118, 87], [262, 264], [264, 262], [85, 115], [115, 85], [244, 89], [140, 76, 82, 99], [236, 111], [330, 168], [76, 63, 107, 124, 128, 135, 140], [131, 254], [254, 131], [21, 38], [38, 21], [220, 291], [291, 220], [296, 46], [64, 53, 57, 61, 63, 65, 66, 76, 96, 100, 103, 114, 123, 127, 128, 130, 148, 149], [274, 240], [157, 225, 234], [225, 247], [233, 44], [89, 244], [80, 101], [210, 214], [78, 155], [55, 139], [102, 74, 75, 132], [105, 252], [149, 55, 59, 63, 71, 73, 81, 100, 102, 116, 122, 138, 146], [97, 231], [231, 97], [155, 78], [239, 305], [305, 239], [145, 94, 248], [147, 150], [61, 64], [152, 219], [219, 152], [246, 250], [252, 105], [223, 235], [235, 223], [237, 60, 344], [344, 237], [182, 129], [331, 117], [12, 2, 8, 10, 13, 15], [250, 246]]
All I want is, for example: [283, 311] exists in [311, 283, 316], so both should be merged, leaving only one list in the above list. I need to merge a list if it exists in some other list.
Please note that I can do it using a loop inside a loop, but I am looking for a Pythonic way to achieve this. Also, if you know how to merge lists sharing at least one common element, please share.
Edit:
Please don't forget that I am looking for a Pythonic approach. If it is not possible with any core Pythonic approach, then what should be the next possible solution using loops? I need efficiency, as I have to merge a list of lists with more than a million items every half an hour. I did it with a for loop inside a for loop, comparing two items at a time, but it takes too much time.
I think this works ...
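The code block was not preserved here, so the following is only a sketch of what such an approach could look like (the function name merge_subsets is mine, not the answer's): convert every inner list to a frozenset and drop any set that is a proper subset of another one.

```python
# Assumed reconstruction, not the original answer's code: keep only the
# maximal sets, i.e. drop any set that is a proper subset of another.
def merge_subsets(lists):
    sets = {frozenset(lst) for lst in lists}   # also dedupes exact duplicates
    return [s for s in sets
            if not any(s < other for other in sets)]   # quadratic check
```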
Note that this gives you a list of frozenset objects, but you can easily turn that back into a list of lists (see the snippet after this paragraph). Unfortunately, it has quadratic complexity ... I'm not sure if there's anything to really be done about that ... (keeping it as lists would result in much worse complexity, FWIW).
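For instance, using the assumed merge_subsets helper from the sketch above, with the question's list of lists bound to a variable called data:

```python
merged = merge_subsets(data)             # list of frozensets
as_lists = [sorted(s) for s in merged]   # frozensets back into plain lists
```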
Here's a second approach which is a little better algorithmically:
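Again, the original block is missing, so this is only a sketch of what a sorted/takewhile variant could look like; the structure (and the name merge_subsets_sorted) is assumed rather than taken from the answer.

```python
from itertools import takewhile

def merge_subsets_sorted(lists):
    # Largest sets first: a set can only be contained in one that is
    # strictly larger, so we only need to scan the leading slice of the
    # already-kept sets instead of every pair.
    sets = sorted({frozenset(lst) for lst in lists}, key=len, reverse=True)
    kept = []
    for s in sets:
        bigger = takewhile(lambda other: len(other) > len(s), kept)
        if not any(s <= other for other in bigger):
            kept.append(s)
    return kept
```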
Of course, it would need to be timed on your real data to see if it actually performs better (for the test data, the lists aren't big enough, and the overhead of generating the conditional function, the takewhile instance and the lambda makes it not worthwhile). For the curious, here's my benchmark:
To answer your latter question: it's a connected-components problem.
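No code was given for this part, but one way to treat "merge lists that share at least one common element" as connected components is a small union-find over the elements (a sketch under that assumption, not the answer's code):

```python
def merge_overlapping(lists):
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for lst in lists:
        if not lst:
            continue
        for x in lst:
            parent.setdefault(x, x)
        for x in lst[1:]:
            union(lst[0], x)            # everything in lst joins one component

    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return [sorted(g) for g in groups.values()]
```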
Here's an alternative approach that I think should be faster than @mgilson's if you have a large number of small sets:
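The answer's code block is missing here, so the following is a reconstruction based only on the description below: build an index from each element to the sets containing it, then for each set intersect the index entries of its elements to find candidate supersets (names like remove_contained are mine).

```python
from collections import defaultdict
from functools import reduce

def remove_contained(lists):
    # Deduplicate so that e.g. [131, 254] and [254, 131] count as one set.
    sets = list({frozenset(lst) for lst in lists})

    # Index: element -> indices of the sets that contain it.
    where = defaultdict(set)
    for i, s in enumerate(sets):
        for x in s:
            where[x].add(i)

    kept = []
    for i, s in enumerate(sets):
        # Intersecting the index entries leaves exactly the sets containing
        # every element of s, i.e. its supersets.  Stopping the reduce as
        # soon as it hits an empty set would be the further optimization
        # mentioned below.
        supersets = reduce(set.intersection, (where[x] for x in s)) - {i}
        if not supersets:
            kept.append(sorted(s))
    return kept
```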
The difference is mainly in the final loop, which runs not through all n(n-1)/2 distinct pairs of sets, but only through the n sets in the outer loop and then through a set of candidate supersets that contain some element of the set under consideration. It can be optimized further by stopping the reduce early when it produces an empty set.