I am trying to write a divide & conquer algorithm for trees. For the divide step I need an algorithm that partitions a given undirected Graph G=(V,E) with n nodes and m edges into sub-trees by removing a node. All subgraphs should have the property that they don't contain more than n/2 nodes (the tree should be split as equal as possible). First I tried to recursively remove all leaves from the tree to find the last remaining node, then I tried to find the longest path in G and remove the middle node of it. The given graph below shows that both approaches don't work:
Is there some working algorithm that does what I want (returns the node H in the above case).
This problem seems similar to finding the center of mass of an object. Assume each of your nodes is a point mass of equal mass (weight) and its position is given by the position in the graph. Your algorithm tries to find the center of mass, i.e. the node that has a similar accumulated weight of nodes in all connected sub-trees.
You may compute the accumulated weights on all sub-trees for each node. Then choose the one that is most balanced, s.t. no sub-tree weighs more than
n/2
. Probably this is a task for some dynamic programming.One exact algorithm is as this,
Start from leafs and create disjoint graphs (in fact all are K1), in each step find the parent of this leafs, and merge them into new tree, in each step if node
x
hasr
known child and degree of node isj
such thatj = r+1
, simply a node which is not in child node ofx
is parent of current node in this case we say nodex
isnice
, else, there are some child such that related rooted subtree of them not constructed, in this case we say the nodex
isbad
.So in each step connect
nice
nodes to their related parent, and it's obvious each step takessum of {degree of parent nice nodes}
also in each step you have at least one nice node (cause you start from leaf), So the algorithm is O(n), and it will be done completely, but for finding node which should be removed, In fact in each step is required to check size of a dijoint list (subtree lists), this can be done in O(1) in construction, Also if the size of list is equal or bigger than n/2, then select related nice node. (in fact find the nice node in minimum list which satisfies this condition).Obvious thing is that if is possible to divide tree in good way (each part has at most n/2 node) you can done it by this algorithm, but if is not so ( in fact you cant divide it in two or more part of size smaller than n/2) this gives you good approximation for it. Also as you can see there is no assumption in input tree.
note: I don't know is possible to have a tree such that it's impossible to partition it into some parts of size smaller than n/2 by removing one node.
Here is a method that I used and tested.
Begin by finding the root of your tree, you can do this by creating a set with all the nodes in them and then making another array NeighboursNumber[] with the number of neighbours of each node stored the corresponding index. Then iterate through the set and eliminate the leafs ( nodes i which have NeighboursNumber[i] == 1 ) , make sure you add those nodes to another set RemovedSet ( to avoid updating problems ) and then after each iteration go through the RemovedSet and decrease the NeighboursNumber[] entry for all neighbours of each element of the set. At the end you will have the root node . ( make sure you implement the case were 2 nodes with 1 neighbours are left ) . After finding the root we proceed to find the size of each subtree. The trick here is to do this while looking for the root :Before eliminating the leafs at the first iteration, start by creating an array SubTreeSize[] and each time we removing a node from the our set, we add the value of that node + 1 to the value of the parent : SubTreeSize[parent] = SubTreeSize[parent] + SubTreeSize[removedNode] + 1 ; this way, when we find the root , we also have the size of each subtree. Then we start from the root and check each neighbour it its subtree's size + 1 > nodes / 2 , if yes then we choose that node and start again. When all the children nodes sizes are <= nodes / 2 , break your loop and output that node.
This method took less than a second for a tree with 10^5 nodes.
I think you could do it with an algorithm like this:
Start in the root (if the tree isn't rooted, pick any node).
In each step, try to descend into the child node that has the largest subtree (the number of nodes “below” it is the biggest).
If doing that would make the number of nodes “above” bigger than n/2, stop, otherwise continue with that child.
This algorithm should be O(log n) if the tree is reasonably balanced and we have sizes of subtrees precomputed for each node. If one of those conditions doesn't apply, it would be O(n).