I have an unweighted, connected graph. I want to find a connected subgraph that definitely includes a certain set of nodes, and as few extras as possible. How could this be accomplished?
Just in case, I'll restate the question using more precise language. Let G(V,E) be an unweighted, undirected, connected graph. Let N be some subset of V. What's the best way to find the smallest connected subgraph G'(V',E') of G(V,E) such that N is a subset of V'?
Approximations are fine.
I can't think of an efficient algorithm to find the optimal solution, but assuming that your input graph is dense, the following might work well enough:
Convert your input graph
G(V, E)
to a weighted graphG'(N, D)
, whereN
is the subset of vertices you want to cover andD
is distances (path lengths) between corresponding vertices in the original graph. This will "collapse" all vertices you don't need into edges.Compute the minimum spanning tree for
G'
."Expand" the minimum spanning tree by the following procedure: for every edge
d
in the minimum spanning tree, take the corresponding path in graphG
and add all vertices (including endpoints) on the path to the result setV'
and all edges in the path to the result setE'
.This algorithm is easy to trip up to give suboptimal solutions. Example case: equilateral triangle where there are vertices at the corners, in midpoints of sides and in the middle of the triangle, and edges along the sides and from the corners to the middle of the triangle. To cover the corners it's enough to pick the single middle point of the triangle, but this algorithm might choose the sides. Nonetheless, if the graph is dense, it should work OK.
You could try to do the following:
Creating a minimal vertex-cover for the desired nodes
N
.Collapse these, possibly unconnected, sub-graphs into "large" nodes. That is, for each sub-graph, remove it from the graph, and replace it with a new node. Call this set of nodes
N'
.Do a minimal vertex-cover of the nodes in
N'
."Unpack" the nodes in
N'
.Not sure whether or not it gives you an approximation within some specific bound or so. You could perhaps even trick the algorithm to make some really stupid decisions.
This is exactly the well-known NP-hard Steiner Tree problem. Without more details on what your instances look like, it's hard to give advice on an appropriate algorithm.
The easiest solutions will be the following:
a) based on mst: - initially, all nodes of V are in V' - build a minimum spanning tree of the graph G(V,E) - call it T.
- loop: for every leaf v in T that is not in N, delete v from V'.
- repeat loop until all leaves in T are in N.
b) another solution is the following - based on shortest paths tree.
- pick any node in N, call it v, let v be a root of a tree T = {v}. - remove v from N.
At the beginning of the algorithm: compute all shortest paths in G using any known efficient algorithm.
Personally, I used this algorithm in one of my papers, but it is more suitable for distributed enviroments. Let N be the set of nodes that we need to interconnect. We want to build a minimum connected dominating set of the graph G, and we want to give priority for nodes in N. We give each node u a unique identifier id(u). We let w(u) = 0 if u is in N, otherwise w(1). We create pair (w(u), id(u)) for each node u.
each node u builds a multiset relay node. That is, a set M(u) of 1-hop neigbhors such that each 2-hop neighbor is a neighbor to at least one node in M(u). [the minimum M(u), the better is the solution].
u is in V' if and only if: u has the smallest pair (w(u), id(u)) among all its neighbors. or u is selected in the M(v), where v is a 1-hop neighbor of u with the smallest (w(u),id(u)).
-- the trick when you execute this algorithm in a centralized manner is to be efficient in computing 2-hop neighbors. The best I could get from O(n^3) is to O(n^2.37) by matrix multiplication.
-- I really wish to know what is the approximation ration of this last solution.
I like this reference for heuristics of steiner tree: The Steiner tree problem, Hwang Frank ; Richards Dana 1955- Winter Pawel 1952