Python: how to compute the number of shortest path

2019-07-14 20:15发布

Say that I have a regular network of NxN nodes. The shortest path between two nodes is the minimum number of hops required to reach one target node from a source node. Now, each shortest path passes through a number of nodes along the way.

My goal: for each node in the network, I want to count the number of shortest paths that pass through a specific node, and save that number in a dict.

In this little example, node B has 4 shortest paths passing through it: A -> B, A -> C, C -> B, C -> A. I want to be able to compute this number for each node in a generic graph. enter image description here

I know I could use the nx.betweenness_centrality(), but this will give me a numerator (which is, for each node, what I want) divided by a denominator which stores all the possible shortest paths between two nodes. I have even accessed the source code but I wasn't able to figure out where the division is performed.

I know this is a wordy question but I had no other means to explain my problem. Thank you to anyone who will help.

EDIT

This is the source code for nx.betweenness_centrality(). My graph is undirected. It is unclear which line hosts the division I introduced above:

def betweenness_centrality(G, k=None, normalized=True, weight=None,
                           endpoints=False,
                           seed=None): #G is the graph

    betweenness = dict.fromkeys(G, 0.0)  
    if k is None:
        nodes = G
    else:
        random.seed(seed)
        nodes = random.sample(G.nodes(), k)
    for s in nodes:
        # single source shortest paths
        if weight is None:  # use BFS
            S, P, sigma = _single_source_shortest_path_basic(G, s)
        else:  # use Dijkstra's algorithm
            S, P, sigma = _single_source_dijkstra_path_basic(G, s, weight)
        # accumulation
        if endpoints:
            betweenness = _accumulate_endpoints(betweenness, S, P, sigma, s)
        else:
            betweenness = _accumulate_basic(betweenness, S, P, sigma, s)
    # rescaling
    betweenness = _rescale(betweenness, len(G),
                           normalized=normalized,
                           directed=G.is_directed(),
                           k=k)
    return betweenness #Returns a dict with the node ID as a key and the value

4条回答
放我归山
2楼-- · 2019-07-14 20:36

You could just use nx.all_pairs_shortest_path(G):

>>> G = nx.Graph()
>>> G.add_path([0,1,2])

>>> spaths = nx.all_pairs_shortest_path(G)
>>> spaths
{0: {0: [0], 1: [0, 1], 2: [0, 1, 2]},
 1: {0: [1, 0], 1: [1], 2: [1, 2]},
 2: {0: [2, 1, 0], 1: [2, 1], 2: [2]}}

Which finds you all the shortest paths between all pair of nodes in a graph (which is very expensive for large graphs). Then, the following code gives you the desired result:

def num_spaths(G):
    n_spaths = dict.fromkeys(G, 0.0)
    spaths = nx.all_pairs_shortest_path(G)

    for source in G:
        for path in spaths[source].values():
            for node in path[1:]: # ignore firs element (source == node)
                n_spaths[node] += 1 # this path passes through `node`

    return n_spaths

In your example:

>>> num_spaths(G)
{0: 2.0, 1: 4.0, 2: 2.0}

Additionally, if you could go inside the all_pairs_shortest_path code and sightly edit it to add a counter for shortest paths and

for node in path[1:]:
    n_spaths[node] += 1

this way, you would update the number of paths online at the same time you find one, rather than having to iterate over all them (as my code does) after all them are calculated.

EDIT: in networkx (github), line 251 says:

paths[w]=paths[v]+[w]

You could potentially modify that function to update the number of shortest paths online (at the same time they are found) by passing n_spath as an argument to the modified function and updating it inside. Then calling the function would be as:

def num_spaths(G):
    n_spaths = dict.fromkeys(G, 0.0)
    for n in G:
        my_single_source_shortest_path(G, n, n_spaths)
    return n_spaths

and n_spaths would have the number of shortest path updated.


EDIT: The above only works if there is only 1 shortest path between A and B, as nx.all_pairs_shortest_path only returns one shortest paths per each node pair. Find bellow the brute force search for all posible shortest paths. I wouldn't recommend runing this in a large graph:

def bf_num_spaths(G):
    n_spaths = dict.fromkeys(G, 0.0)

    for source in G:
        for target in G:
            if source == target:
                continue
            for path in nx.all_shortest_paths(G, source, target):
                for node in path[1:]: # ignore firs element (source == node)
                    n_spaths[node] += 1 # this path passes through `node`

    return n_spaths
查看更多
你好瞎i
3楼-- · 2019-07-14 20:38

I'll try to give pure algorithmic answer.

What if there are several shortest paths from v_1 to v_2 and some of them contain v_mid? If this case is counted as 1 for each (v_1, v_2) pair, you can use Floyd-Warshall algorithm. After its run, you have matrix with length of shortest paths. For each u, iterate over all (v_1, v_2) pairs and count +1 if D[v_1, u] + D[u, v_2] == D[v_1, v_2]. It's O(n^3) and good for dense graphs.

I'm not sure but I think that it can be adjusted to count number of shortest paths.

查看更多
劳资没心,怎么记你
4楼-- · 2019-07-14 20:54

Without further access to the source code it is impossible to say for sure but my bet would be on:

S, P, sigma = _single_source_shortest_path_basic(G, s)

Or

betweenness = _rescale(betweenness, len(G),...

Rescale just sounds like it will contain a division.

Have you tried calling this function with normalized=False?

查看更多
We Are One
5楼-- · 2019-07-14 20:58

I believe you can modify the Floyd-Warshall Algorithm to save the length of the path in another matrix. Pseudocode (based on wikipedia):

let dist and num be two |V| × |V| arrays of minimum
distances and lengths initialized to ∞ (infinity)

for each vertex v
   dist[v][v] ← 0
   num[v][v] ← 0

for each edge (u,v)
   dist[u][v] ← w(u,v)  // the weight of the edge (u,v)
   num[u][v] ← 1

for k from 1 to |V|
    for i from 1 to |V|
        for j from 1 to |V|
            if dist[i][j] > dist[i][k] + dist[k][j] 
                dist[i][j] ← dist[i][k] + dist[k][j]
                num[i][j] ← num[i][k] + num[k][j]
            end if
查看更多
登录 后发表回答