What I have so far is largely based off page 571 of "Introduction To Algorithms" by Cormen et al.
I have a Node class in python that represents a set:
class Node:
def __init__(self, parent, rank = 0):
self.parent = parent
self.rank = rank
This implementation is going to use a List
of Nodes as the forest (I am open to better ways to store the sets).
Initialize()
returns a list of Nodes, that I will store in variable set
and pass into the other functions.
Find
searches through the forest for the value and returns the set that it appears in. I chose to use for s in range(len(set)):
so that in the recursion I could shrink the list of sets being passed in by set[s:]
.
def Find(set, value):
for s in range(len(set)):
if value != set[s].parent:
set[s].parent = Find(set[s:], set[s].parent)
return set[s]
Merge
merges sets in the forest by finding them and promoting the higher ranked set.
def Merge(set, value1, value2):
set_val_1 = Find(set, value1)
set_val_2 = Find(set, value2)
if set_val_1.rank > set_val_2.rank:
set_val_2.parent = set_val_1
else:
set_val_1.parent = set_val_2
if set_val_1.rank == set_val_2.rank:
set_val_2.rank += 1
I am getting some errors when I do Find
s and Merge
s, namely Find
is not returning the proper set, and so I am not sure if Merge
is working properly as well. I would appreciate some help to make sure the functions are implemented properly.
I don't have the latest edition of the book, but this doesn't look quite like a disjoint-set forest.
I think your mistake is to think that the forest has to be stored in a collection and that you have to traverse this collection to do the operations on the nodes. Remove
set
fromMerge()
andFind()
and implementFind()
asjust like in the book. Then add a
MakeSet()
that returns a single correctly initialized node, and maybe aSameSet()
function too:You now have a working disjoint set implementation.
Presuming that each node is initialised to be its own parent:
Have you looked at any other existing implementations?
The Wikipedia page provides pseudocode for the basic operations on disjoint-set data structure. Here's a direct port to Python (employs path compression and union by rank):
Demo:
Using this implementation as a starting point, I've created a new python class to handle disjoint sets, which also supports persistence using a MongoDb.
With my class you should be able, for example, to:
UnionFind()
, which uses python build-in dictionaries by default, and decide later toconsolidate()
your results in a MongoDb collectionYou might want to check the code on github.
Cheers, Simone