I have a random graph represented by an adjacency matrix in Java. How can I find the connected components (sub-graphs) within this graph?
I have found BFS and DFS, but I'm not sure they are suitable, nor could I work out how to implement them for an adjacency matrix.
Any ideas?
You need to allocate marks, an int array of length n, where n is the number of vertices in the graph, and fill it with zeros. Then:
1) For BFS do the following:
Components = 0;
Enumerate all vertices; if for vertex number i, marks[i] == 0, then
    ++Components;
    Put this vertex into the queue, and
    while the queue is not empty,
        pop vertex v from the queue
        marks[v] = Components;
        Put all adjacent vertices with marks equal to zero into the queue.
2) For DFS do the following:
Components = 0;
Enumerate all vertices; if for vertex number i, marks[i] == 0, then
    ++Components;
    Call DFS(i, Components), where DFS is
DFS(vertex, Components)
{
    marks[vertex] = Components;
    Enumerate all vertices adjacent to vertex and
    for all vertices j for which marks[j] == 0
        call DFS(j, Components);
}
After performing either of these procedures, Components will hold the number of connected components,
and for each vertex i, marks[i] will hold the index of the connected component that i belongs to.
With an adjacency matrix, both run in O(n^2) time, where n is the number of vertices, because every row of the matrix has to be scanned once; the marks array takes O(n) memory. I suggest BFS, since it doesn't suffer from the stack overflow problem and doesn't spend time on recursive calls.
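The BFS procedure above could be sketched in Java roughly as follows. This is only a minimal sketch: the class name ComponentLabeler and the method name labelComponents are made up for illustration, and it assumes an undirected graph, so the adjacency matrix is symmetric. Unlike the pseudocode, it sets the mark when a vertex is put into the queue, so no vertex is queued twice.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

class ComponentLabeler {

    // Returns marks[], where marks[v] is the 1-based index of the component containing v.
    // Assumption: adj is a square, symmetric matrix (undirected graph).
    static int[] labelComponents(boolean[][] adj) {
        int n = adj.length;
        int[] marks = new int[n];              // 0 means "not visited yet"
        int components = 0;
        Queue<Integer> queue = new ArrayDeque<>();

        for (int start = 0; start < n; start++) {
            if (marks[start] != 0) {
                continue;                      // already assigned to a component
            }
            components++;
            marks[start] = components;         // mark on enqueue
            queue.add(start);
            while (!queue.isEmpty()) {
                int v = queue.remove();
                for (int w = 0; w < n; w++) {  // scan row v for neighbours
                    if (adj[v][w] && marks[w] == 0) {
                        marks[w] = components;
                        queue.add(w);
                    }
                }
            }
        }
        return marks;
    }

    public static void main(String[] args) {
        boolean[][] adj = {
            {false, true,  false, false, false},
            {true,  false, false, true,  true },
            {false, false, false, false, false},
            {false, true,  false, false, false},
            {false, true,  false, false, false}
        };
        int[] marks = labelComponents(adj);
        int count = Arrays.stream(marks).max().orElse(0);
        // Prints: 2 components: [1, 1, 2, 1, 1]
        System.out.println(count + " components: " + Arrays.toString(marks));
    }
}
```

The largest value in the returned array is the number of components, and grouping the vertices by their mark gives the sub-graphs themselves.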
BFS code in Java:
// Needs: import java.util.LinkedList; import java.util.Queue;
public static boolean[] BFS(boolean[][] adjacencyMatrix, int vertexCount, int givenVertex){
    // Result array.
    boolean[] mark = new boolean[vertexCount];
    Queue<Integer> queue = new LinkedList<Integer>();
    queue.add(givenVertex);
    mark[givenVertex] = true;
    while (!queue.isEmpty())
    {
        Integer current = queue.remove();
        // Scan the row of the current vertex for unvisited neighbours.
        for (int i = 0; i < vertexCount; ++i)
            if (adjacencyMatrix[current][i] && !mark[i])
            {
                mark[i] = true;
                queue.add(i);
            }
    }
    return mark;
}
public static void main(String[] args) {
    // adjacencyMatrix[x][y] is true if and only if there is an edge between x and y.
    // The graph is undirected, so the matrix must be symmetric.
    boolean[][] adjacencyMatrix = new boolean[][]
    {
        {false,true,false,false,false},
        {true,false,false,true,true},
        {false,false,false,false,false},
        {false,true,false,false,false},
        {false,true,false,false,false}
    };
    // mark[i] is true if and only if i belongs to the same connected component as givenVertex.
    boolean[] mark = BFS(adjacencyMatrix, 5, 0);
    for (int i = 0; i < 5; ++i)
        System.out.println(mark[i]);
}
You can implement DFS iteratively with a stack, to eliminate the problems of recursive calls and call stack overflow. The implementation is very similar to BFS with a queue: you just have to mark vertices when you pop them, not when you push them onto the stack. Because a vertex can then be pushed more than once, skip any vertex that is already marked when you pop it.
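A minimal sketch of that stack-based DFS, under the same assumptions as the BFS code (boolean adjacency matrix, undirected graph). The class name IterativeDfs and the method name reachableFrom are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class IterativeDfs {

    // Returns visited[], where visited[v] is true for every vertex reachable from start.
    static boolean[] reachableFrom(boolean[][] adj, int start) {
        int n = adj.length;
        boolean[] visited = new boolean[n];
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(start);

        while (!stack.isEmpty()) {
            int v = stack.pop();
            if (visited[v]) {
                continue;                       // v was pushed more than once
            }
            visited[v] = true;                  // mark on pop, not on push
            // Push in reverse so that lower-numbered neighbours are popped first.
            for (int w = n - 1; w >= 0; w--) {
                if (adj[v][w] && !visited[w]) {
                    stack.push(w);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        boolean[][] adj = {
            {false, true,  false, false, false},
            {true,  false, false, true,  true },
            {false, false, false, false, false},
            {false, true,  false, false, false},
            {false, true,  false, false, false}
        };
        // Prints: [true, true, false, true, true]
        System.out.println(Arrays.toString(reachableFrom(adj, 0)));
    }
}
```

Since vertices are marked on pop, the same vertex may sit in the stack several times, so the stack can grow up to the number of edges rather than the number of vertices. It lives on the heap, though, so a deep graph does not overflow the call stack.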
Using scipy's sparse module, and assuming your input is a dictionary from a (label_1, label_2) pair to a weight, you can run this code:
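# dict2graph and collect_components are not scipy functions; they are helpers (see the gist).
from scipy import sparse

vertices, edges = dict2graph(cooccur_matrix, edge_threshold)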
n, components = sparse.csgraph.connected_components(edges, directed=False)
print ('Found {n} components'.format(n=n))
components = collect_components(components,vertices)
components = [c for c in components if len(c)>=component_threshold]
print ('removed {k} small components'.format(k=n-len(components)))
print ('component sizes: '+ repr([len(c) for c in components]))
See full gist on github here