Python/NetworkX: calculate edge weights on the fly

Posted 2020-07-27 16:30

Question:

I have an unweighted graph created with networkx, and I would like to calculate the weight of each edge from the count/frequency with which that edge occurs. An edge in my graph can occur more than once, but how often it appears is not known in advance. The purpose is to visualize the edges based on the weight (i.e. count/frequency) of moves between connected nodes. Essentially, I'd like to create a network traffic map of movement between connected nodes, visualized by edge color or edge width. For example, if there are 10 movements from node 0 to node 1 and 5 from node 1 to node 2, edge 0-1 would be drawn with a different color/width than edge 1-2.

How can I calculate the weight of edges between two nodes, on the fly (after adding them to the graph with g.add_edges_from()), and then reapply to my graph for visualization? Below is a sample of my graph, data, and code I've used to create the graph initially and a solution I attempted that failed.

Graph

Sample Data

Cluster centroids(nodes)

cluster_label,latitude,longitude
0,39.18193382,-77.51885109
1,39.18,-77.27
2,39.17917928,-76.6688633
3,39.1782,-77.2617
4,39.1765,-77.1927
5,39.1762375,-76.8675441
6,39.17468,-76.8204499
7,39.17457332,-77.2807235
8,39.17406072,-77.274685
9,39.1731621,-77.2716502
10,39.17,-77.27

Trajectories(edges)

user_id,trajectory
11011.0,"[[340, 269], [269, 340]]"
80973.0,"[[398, 279]]"
608473.0,"[[69, 28]]"
2139671.0,"[[382, 27], [27, 285]]"
3945641.0,"[[120, 422], [422, 217], [217, 340], [340, 340]]"
5820642.0,"[[458, 442]]"
6060732.0,"[[291, 431]]"
6912362.0,"[[68, 27]]"
7362602.0,"[[112, 269]]"
8488782.0,"[[133, 340], [340, 340]]"

Code

import csv
import networkx as nx
import pandas as pd
import community
import matplotlib.pyplot as plt
import time
import mplleaflet

g = nx.MultiGraph()

df = pd.read_csv('cluster_centroids.csv', delimiter=',')
df['pos'] = list(zip(df.longitude,df.latitude))
dict_pos = dict(zip(df.cluster_label,df.pos))
#print dict_pos

for row in csv.reader(open('edges.csv', 'r')):
    if '[' in row[1]:       #
        g.add_edges_from(eval(row[1]))

# Plotting with mplleaflet
fig, ax = plt.subplots()
nx.draw_networkx_nodes(g,pos=dict_pos,node_size=50,node_color='b')
nx.draw_networkx_edges(g,pos=dict_pos,linewidths=0.01,edge_color='k', alpha=.05)
nx.draw_networkx_labels(g,dict_pos)
mplleaflet.show(fig=ax.figure)

I have tried using g.add_weighted_edges_from() and adding weight=1 as an attribute, without success. I also tried the following, which did not work either:

for u,v,d in g.edges():
    d['weight'] = 1
g.edges(data=True)
edges = g.edges()
weights = [g[u][v]['weight'] for u,v in edges]

Answer 1:

Since this went unanswered, a second question on the topic was opened (Python/NetworkX: Add Weights to Edges by Frequency of Edge Occurance), which received responses. To add weights to edges based on the count of edge occurrences:

g = nx.MultiDiGraph()

df = pd.read_csv(r'G:\cluster_centroids.csv', delimiter=',')
df['pos'] = list(zip(df.longitude,df.latitude))
dict_pos = dict(zip(df.cluster_label,df.pos))
#print dict_pos


for row in csv.reader(open(r'G:\edges.csv', 'r')):
    if '[' in row[1]:       #
        g.add_edges_from(eval(row[1]))

for u, v, d in g.edges(data=True):
    d['weight'] = 1
for u, v, d in g.edges(data=True):
    print(u, v, d)
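
The counting step itself can be sketched without networkx. The following is a minimal illustration, assuming trajectory strings in the same shape as the sample data above; the helper name count_edges and the directed flag are hypothetical and not part of the original answer. It uses ast.literal_eval instead of eval so that the CSV text is parsed as a literal rather than executed.

```python
import ast
from collections import Counter

# Sample rows in the same shape as the 'trajectory' column of edges.csv.
rows = [
    "[[340, 269], [269, 340]]",
    "[[120, 422], [422, 217], [217, 340], [340, 340]]",
    "[[133, 340], [340, 340]]",
]

def count_edges(trajectories, directed=True):
    """Return a Counter mapping (u, v) to the number of times the edge occurs."""
    counts = Counter()
    for text in trajectories:
        # literal_eval parses the list literal without executing arbitrary code.
        for u, v in ast.literal_eval(text):
            # For undirected counting, (u, v) and (v, u) share one sorted key.
            key = (u, v) if directed else tuple(sorted((u, v)))
            counts[key] += 1
    return counts

edge_counts = count_edges(rows)

# Triples in the (u, v, weight) form accepted by add_weighted_edges_from().
weighted_edges = [(u, v, n) for (u, v), n in edge_counts.items()]
```

Passing weighted_edges to add_weighted_edges_from() on a plain nx.DiGraph should then give one edge per node pair, with the occurrence count stored in its 'weight' attribute, instead of many parallel edges.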

To scale color and edge width based on the above count:

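import matplotlib.cm as cmx
import matplotlib.colors as colors
from collections import Counter

minLineWidth = 0.25

# Assumption: c is not defined in the original snippet; it is taken here
# to be a Counter of how often each (u, v) pair occurs in the multigraph.
c = Counter(g.edges())

for u, v, d in g.edges(data=True):
    d['weight'] = c[u, v]*minLineWidth
edges,weights = zip(*nx.get_edge_attributes(g,'weight').items())

values = range(len(g.edges()))
jet = cm = plt.get_cmap('YlOrRd')
cNorm  = colors.Normalize(vmin=0, vmax=values[-1])
scalarMap = cmx.ScalarMappable(norm=cNorm, cmap=jet)
colorList = []

for i in range(len(g.edges())):
    colorVal = scalarMap.to_rgba(values[i])
    colorList.append(colorVal)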

Then pass width=[d['weight'] for u, v, d in g.edges(data=True)] and edge_color=colorList as arguments to nx.draw_networkx_edges().
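
If the raw counts span a wide range, multiplying by a fixed minimum width can produce very thick lines. One possible alternative, shown here only as a sketch, is to map the counts linearly onto a bounded width range. The function scale_widths and the max_width default are assumptions made for illustration and do not come from the original answer.

```python
def scale_widths(counts, min_width=0.25, max_width=5.0):
    """Linearly map each count onto the range [min_width, max_width]."""
    lo, hi = min(counts), max(counts)
    if hi == lo:
        # All edges occur equally often, so draw them all at the minimum width.
        return [min_width for _ in counts]
    span = hi - lo
    return [min_width + (c - lo) / span * (max_width - min_width) for c in counts]

# Example: three edges that occur 10, 5 and 1 times.
counts = [10, 5, 1]
widths = scale_widths(counts)
```

The resulting list can be passed as width= to nx.draw_networkx_edges(), provided it is in the same order as the edges being drawn. To color edges by count rather than by their position in the edge list, the counts themselves can be passed as edge_color together with edge_cmap=plt.get_cmap('YlOrRd'), which lets matplotlib do the normalization.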