I carefully read the docs, but it still is unclear to me how to use G.forEdges(), described as an "experimental edge iterator interface".
Let's say that I want to decrease the density of my graph. I have a sorted list of weights, and I want to remove edges based on their weight until the graph splits into two connected components. Then I'll select the minimum number of links that keeps the graph connected. I would do something like this:
cc = components.ConnectedComponents(G).run()
while cc.numberOfComponents()==1:
for weight in weightlist:
for (u,v) in G.edges():
if G.weight(u,v)==weight:
G=G.removeEdge(u,v)
By the way I know from the docs that there is this edge iterator, which probably does the iteration in a more efficient way. But from the docs I really can't understand how to correctly use this forEdges
, and I can't find a single example over the internet. Any ideas?
Or maybe an alternative idea to do what I want to do: since it's a huge graph (125millions links) the iteration will take forever, even if I am working on a cluster.
NetworKit iterators accept a callback function so if you want to iterate over edges (or nodes) you have to define a function and then pass it to the iterator as a parameter. You can find more information here. For example a simple function that just prints all edges is:
# Callback function.
# To iterate over edges it must accept 4 parameters
def myFunction(u, v, weight, edgeId):
print("Edge from {} to {} has weight {} and id {}".format(u, v, weight, edgeId))
# Using iterator with callback function
G.forEdges(myFunction)
Now if you want to keep removing edges whose weight is inside your weightlist until the graph splits into two connected components you also have to update the connected components of the graph since ConnectedComponents will not do that for you automatically (this may be also one of the reasons why the iteration takes forever). To do this efficiently, you can use the DynConnectedComponents class (see my example below). In this case, I think that the edge iterator will not help you much so I would suggest you to keep using the for loop.
from networkit import *
# Efficiently updates connected components after edge updates
cc = components.DynConnectedComponents(G).run()
# Removes edges with weight equals to w until components split
def removeEdges(w):
for (u, v) in G.edges():
if G.weight(u, v) == weight:
G.removeEdge(u, v)
# Updating connected components
event = dynamic.GraphEvent(dynamic.GraphEvent.EDGE_REMOVAL, u, v, weight)
cc.update(event)
if cc.numberOfComponents() > 1:
# Components did split
return True
# Components did not split
return False
if cc.numberOfComponents() == 1:
for weight in weights:
if removeEdges(weight):
break
This should speed up a bit your original code. However, it is still sequential code so even if you run it on a multi-core machine it will use only one core.