What is the best structure to choose for updating nodes

Posted 2020-03-07 14:19

I have been looking for a while for a way to update node properties in GraphX. I am working on a graph whose nodes each carry a property, for example (1,(2,true)). In this example, 1 is the node ID, 2 is the node's label, and true indicates that the node has been visited. I loaded the graph with GraphLoader, so it is a distributed graph backed by RDDs.

The structure I am using for every node is as follows:

case class nodes_properties(label: Int, isVisited: Boolean = false)
var work_graph = graph.mapVertices { case (node, property) => nodes_properties(node.toInt, false) }.cache()

When I want to update a node's property (for example, its label), I use the following:

work_graph = work_graph.mapVertices((vid: VertexId, v: nodes_properties) => {
  if (vid == my_node) nodes_properties(newLabel, true)
  else v
})

The mapVertices update does what I want, but it seems very costly: for a graph with just 30000 nodes it takes about 4 minutes, whereas the same operations in MATLAB take about 25 seconds.
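
For comparison, several per-node changes could be collected and applied in a single pass instead of one mapVertices call per node. The following is only a minimal sketch under assumptions, not a benchmarked fix: it assumes the graph came from GraphLoader.edgeListFile (so the edge attribute is Int), it uses GraphX's joinVertices operator, and the names applyUpdates and updates are hypothetical. Whether it is faster for this workload is untested.

```scala
import org.apache.spark.graphx.{Graph, VertexId}
import org.apache.spark.rdd.RDD

case class nodes_properties(label: Int, isVisited: Boolean = false)

// Hypothetical helper: apply a whole batch of label changes in one pass.
// `updates` pairs each vertex id with its new label (an assumed shape).
def applyUpdates(
    g: Graph[nodes_properties, Int],
    updates: RDD[(VertexId, Int)]): Graph[nodes_properties, Int] =
  // joinVertices keeps vertices that have no entry in `updates` unchanged.
  g.joinVertices(updates) { (_, _, newLabel) =>
    nodes_properties(newLabel, isVisited = true)
  }
```

With a SparkContext named sc, the single-node update from the question would then look like `work_graph = applyUpdates(work_graph, sc.parallelize(Seq((my_node, newLabel)))).cache()`, and a larger Seq would update many nodes per iteration.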

Question: Is there a better structure, or a more efficient method, for updating node properties in a graph while the algorithm is running? This is a real bottleneck for me and I have not been able to solve it.

I should mention that the algorithm is iterative, and at each iteration I need to update node properties based on some conditions.

NOTE: I already use unpersistVertices() and graph.checkpoint(), but updating node properties this way is still very time-consuming.
