Networkx: extract the connected component containi

2019-02-06 05:26发布

问题:

I am trying to extract from a big graph the sub-graph of all connected nodes containing a specific node.

Is there a solution in the Networkx library?

[EDIT]
My graph is a DiGraph

[EDIT]
Rephrased simply:
I want the part of my graph that contain my specific node N_i and and all the nodes that are connected directly or indirectly (passing by other nodes) using any incoming or outcoming edges.
Example:

>>> g = nx.DiGraph()
>>> g.add_path(['A','B','C',])
>>> g.add_path(['X','Y','Z',])
>>> g.edges()
[('A', 'B'), ('B', 'C'), ('Y', 'Z'), ('X', 'Y')]

My desired result would be:

>>> g2 = getSubGraph(g, 'B')
>>> g2.nodes()
['A', 'B', 'C']
>>> g2.edges()
[('A', 'B'), ('B', 'C')]

回答1:

You can use shortest_path() to find all of the nodes reachable from a given node. In your case you need to first convert the graph to an undirected representation so both in- and out-edges are followed.

In [1]: import networkx as nx

In [2]: >>> g = nx.DiGraph()

In [3]: >>> g.add_path(['A','B','C',])

In [4]: >>> g.add_path(['X','Y','Z',])

In [5]: u = g.to_undirected()

In [6]: nodes = nx.shortest_path(u,'B').keys()

In [7]: nodes
Out[7]: ['A', 'C', 'B']

In [8]: s = g.subgraph(nodes)

In [9]: s.edges()
Out[9]: [('A', 'B'), ('B', 'C')]

Or in one line

In [10]: s = g.subgraph(nx.shortest_path(g.to_undirected(),'B'))

In [11]: s.edges()
Out[11]: [('A', 'B'), ('B', 'C')]


回答2:

Simply loop through the subgraphs until the target node is contained within the subgraph.

For directed graphs, I assume a subgraph is a graph such that every node is accessible from every other node. This is a strongly connected subgraph and the networkx function for that is strongly_connected_component_subgraphs.

(MWE) Minimal working example:

import networkx as nx
import pylab as plt

G = nx.erdos_renyi_graph(30,.05)
target_node = 13

pos=nx.graphviz_layout(G,prog="neato")

for h in nx.connected_component_subgraphs(G):
    if target_node in h:
        nx.draw(h,pos,node_color='red')
    else:
        nx.draw(h,pos,node_color='white')

plt.show()

For a directed subgraph (digraph) example change the corresponding lines to:

G = nx.erdos_renyi_graph(30,.05, directed=True)
...
for h in nx.strongly_connected_component_subgraphs(G):

Note that one of the nodes is in the connected component but not in the strongly connected component!



回答3:

Use the example at the end of the page connected_component_subgraphs.

Just ensure to refer the last element from the list rather than the first

>>> G=nx.path_graph(4)
>>> G.add_edge(5,6)
>>> H=nx.connected_component_subgraphs(G)[-1]


回答4:

I found three solution to solve your requirement, just same as mine. The size of my Digraph are between 6000 to 12000 nodes, and max subgraph size will up to 3700. Three function I used are:

def create_subgraph_dfs(G, node):
    """ bidirection, O(1)"""
    edges = nx.dfs_successors(G, node)
    nodes = []
    for k,v in edges.items():
        nodes.extend([k])
        nodes.extend(v)
    return G.subgraph(nodes)

def create_subgraph_shortpath(G, node):
    """ unidirection, O(1)"""
    nodes = nx.single_source_shortest_path(G,node).keys()
    return G.subgraph(nodes)

def create_subgraph_recursive(G, sub_G, start_node):
    """ bidirection, O(nlogn)"""
    for n in G.successors_iter(start_node):
        sub_G.add_path([start_node, n])
        create_subgraph_recursive(G, sub_G, n)

The test result is summary as follow: