Question:

I have a random graph represented by an adjacency matrix in Java, how can I find the connected components (sub-graphs) within this graph?

I have found BFS and DFS, but I am not sure they are suitable, nor could I work out how to implement them for an adjacency matrix.

Any ideas?

Answer 1:

Allocate marks, an int array of length n, where n is the number of vertices in the graph, and fill it with zeros. Then:

1) For BFS do the following:

Components = 0;

Enumerate all vertices; if for vertex number i, marks[i] == 0, then

    ++Components;

    marks[i] = Components;

    Put this vertex into the queue, and

    while the queue is not empty,

        pop vertex v from the queue;

        for every adjacent vertex j with marks[j] == 0, set marks[j] = Components and put j into the queue.

2) For DFS do the following.

Components = 0;

Enumerate all vertices; if for vertex number i, marks[i] == 0, then

    ++Components;

    Call DFS(i, Components), where DFS is

    DFS(vertex, Components)
    {
        marks[vertex] = Components;
        for each vertex j adjacent to vertex
            if marks[j] == 0
                call DFS(j, Components);
    }

After performing either of these procedures, Components holds the number of connected components, and for each vertex i, marks[i] is the index of the connected component that i belongs to.
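As a rough illustration, the BFS variant of this labelling could be sketched in Java as follows. The class name ConnectedComponents and the method name labelComponents are made up for this example, and the sketch assumes the adjacency matrix is square and symmetric (an undirected graph).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Hypothetical helper class, named only for this example.
final class ConnectedComponents {

    // Returns marks, where marks[i] is the 1-based index of the component
    // that vertex i belongs to. Assumes a square, symmetric matrix.
    static int[] labelComponents(boolean[][] adjacencyMatrix) {
        int n = adjacencyMatrix.length;
        int[] marks = new int[n];              // 0 means "not visited yet"
        int components = 0;
        Queue<Integer> queue = new ArrayDeque<>();

        for (int start = 0; start < n; ++start) {
            if (marks[start] != 0) {
                continue;                      // already belongs to a component
            }
            ++components;
            marks[start] = components;         // mark when enqueueing
            queue.add(start);
            while (!queue.isEmpty()) {
                int v = queue.remove();
                for (int j = 0; j < n; ++j) {
                    if (adjacencyMatrix[v][j] && marks[j] == 0) {
                        marks[j] = components;
                        queue.add(j);
                    }
                }
            }
        }
        return marks;
    }

    public static void main(String[] args) {
        // Edges 0-1, 1-3, 1-4; vertex 2 is isolated.
        boolean[][] adjacencyMatrix = {
            {false, true,  false, false, false},
            {true,  false, false, true,  true },
            {false, false, false, false, false},
            {false, true,  false, false, false},
            {false, true,  false, false, false}
        };
        // Prints [1, 1, 2, 1, 1]
        System.out.println(Arrays.toString(labelComponents(adjacencyMatrix)));
    }
}
```

The number of components is the largest value in the returned array, since component indices are handed out in increasing order starting from 1.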

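With an adjacency matrix, both run in O(n^2) time, that is, linear in the number of matrix entries, because every row has to be scanned once; both use O(n) extra memory, where n is the number of vertices. I suggest BFS, because it doesn't suffer from the stack overflow problem and doesn't spend time on recursive calls.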

BFS code in Java (it needs imports of java.util.LinkedList and java.util.Queue):

  public static boolean[] BFS(boolean[][] adjacencyMatrix, int vertexCount, int givenVertex){
      // Result array.
      boolean[] mark = new boolean[vertexCount];

      Queue<Integer> queue = new LinkedList<Integer>();
      queue.add(givenVertex);
      mark[givenVertex] = true;

      while (!queue.isEmpty())
      {
        Integer current = queue.remove();

        for (int i = 0; i < vertexCount; ++i)
            if (adjacencyMatrix[current][i] && !mark[i])
            {
                mark[i] = true;
                queue.add(i);
            }
      }

      return mark;
  }


  public static void main(String[] args) {
      // adjacencyMatrix[x][y] is true if and only if there is an edge between x and y.
      boolean[][] adjacencyMatrix = new boolean[][]
              {
                      {false,true,false,false,false},
                      {true,false,false,true,true},
                      {false,false,false,false,false},
                      {false,true,false,false,false},
                      {false,true,false,false,false}
              };
      // mark[i] is true if and only if vertex i belongs to the same connected component as givenVertex.
      boolean[] mark = BFS(adjacencyMatrix, 5, 0);

      for (int i = 0; i < 5; ++i)
          System.out.println(mark[i]);
  }

Answer 2:

You can implement DFS iteratively with a stack, to eliminate the problems of recursive calls and call stack overflow. The implementation is very similar to BFS with a queue: you just have to mark vertices when you pop them, not when you push them onto the stack.
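A minimal sketch of that idea in Java, assuming a square boolean adjacency matrix; the class name IterativeDfs and the method name reachableFrom are invented for this example. Because vertices are marked on pop, the same vertex can sit on the stack more than once, so the loop skips entries that are already marked.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical helper class, named only for this example.
final class IterativeDfs {

    // Returns mark, where mark[i] is true if vertex i is reachable from start.
    static boolean[] reachableFrom(boolean[][] adjacencyMatrix, int start) {
        int n = adjacencyMatrix.length;
        boolean[] mark = new boolean[n];
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(start);

        while (!stack.isEmpty()) {
            int v = stack.pop();
            if (mark[v]) {
                continue;            // pushed more than once, already handled
            }
            mark[v] = true;          // mark on pop, not on push
            // Push neighbours in reverse so lower-numbered ones are visited first.
            for (int j = n - 1; j >= 0; --j) {
                if (adjacencyMatrix[v][j] && !mark[j]) {
                    stack.push(j);
                }
            }
        }
        return mark;
    }

    public static void main(String[] args) {
        // Edges 0-1, 1-3, 1-4; vertex 2 is isolated.
        boolean[][] adjacencyMatrix = {
            {false, true,  false, false, false},
            {true,  false, false, true,  true },
            {false, false, false, false, false},
            {false, true,  false, false, false},
            {false, true,  false, false, false}
        };
        // Prints [true, true, false, true, true]
        System.out.println(Arrays.toString(reachableFrom(adjacencyMatrix, 0)));
    }
}
```

One trade-off of marking on pop is that the stack may temporarily hold duplicates of a vertex, so it can grow beyond n entries on dense graphs.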

Answer 3:

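With scipy's sparse module (in Python), and assuming your input is a dictionary that maps (label_1, label_2) pairs to weights, you can run the code below. dict2graph and collect_components are helper functions that are not part of scipy.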

vertices, edges = dict2graph(cooccur_matrix, edge_threshold)
n, components = sparse.csgraph.connected_components(edges, directed=False)
print ('Found {n} components'.format(n=n))
components = collect_components(components,vertices)
components = [c for c in components if len(c)>=component_threshold]
print ('removed {k} small components'.format(k=n-len(components)))
print ('component sizes: '+ repr([len(c) for c in components]))

See the full gist on GitHub.