I have been looking for a while for a way to update node properties in GraphX. I am working on a graph whose nodes each carry a property, for example (1,(2,true)). In this example, 1 is the node ID, 2 is the node's label, and true indicates that the node has been visited. I loaded the graph with GraphLoader, so it is a distributed graph backed by RDDs.
The structure I am using for every node is as follows:
case class nodes_properties(label: Int, isVisited: Boolean = false)
var work_graph = graph.mapVertices { case (node, property) => nodes_properties(node.toInt, false) }.cache()
When I want to update a node's property (for example, its label), I use the following code:
work_graph = work_graph.mapVertices((vid: VertexId, v: nodes_properties) => {
  if (vid == my_node) nodes_properties(newLabel, true)
  else v
})
This does what I want, but as far as I can see it is very costly: for a graph with only 30000 nodes it takes about 4 minutes, whereas the same operations in MATLAB take about 25 seconds.
Question: Is there a more efficient structure or method for updating node properties in a graph while the algorithm is running? This is a real bottleneck for me and I have not been able to solve it.
I should mention that the algorithm is iterative, and at each iteration I need to update node properties based on some conditions.
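For comparison, here is a minimal sketch of applying all of one iteration's changes in a single pass with GraphX's joinVertices, instead of one mapVertices call per node. Whether this is actually faster for this workload is an assumption that has not been measured here. The object name BatchedUpdateSketch, the helper applyUpdates, the updates map, and the toy edge list are hypothetical and only stand in for the real algorithm.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Graph, VertexId}

// Same shape as the case class in the question.
case class nodes_properties(label: Int, isVisited: Boolean = false)

object BatchedUpdateSketch {
  // Applies every pending (vertexId -> newLabel) change in one join.
  // Vertices that do not appear in `updates` keep their current value.
  def applyUpdates(
      sc: SparkContext,
      g: Graph[nodes_properties, Int],
      updates: Map[VertexId, Int]): Graph[nodes_properties, Int] = {
    val table = sc.parallelize(updates.toSeq)
    g.joinVertices(table) { (_, _, newLabel) =>
      nodes_properties(newLabel, isVisited = true)
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("batched-update-sketch")
    val sc = new SparkContext(conf)

    // Hypothetical toy edge list standing in for the GraphLoader input.
    val edges = sc.parallelize(Seq((1L, 2L), (2L, 3L), (3L, 1L)))
    val graph = Graph.fromEdgeTuples(edges, defaultValue = 0)
    val workGraph = graph.mapVertices { case (id, _) => nodes_properties(id.toInt) }.cache()

    // Collect the changes decided during one iteration, then apply them together.
    val updated = applyUpdates(sc, workGraph, Map(1L -> 10, 3L -> 30))
    updated.vertices.collect().sortBy(_._1).foreach(println)

    sc.stop()
  }
}
```

The idea is that each transformation of the vertex RDD touches every vertex, so the number of such transformations per iteration matters more than the number of nodes changed by each one.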
NOTE: I already use unpersistVertices() and graph.checkpoint(), but updating node properties this way is still very time-consuming.
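To make the note above concrete, here is a sketch of how caching, unpersistVertices() and checkpoint() might be arranged inside an iterative loop. It is an assumption about the loop's shape, not the actual algorithm: the object IterationLoopSketch, the parameter checkpointEvery, and the per-iteration rule (mark the vertex whose label equals the iteration number) are hypothetical placeholders.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.Graph

// Same shape as the case class in the question.
case class nodes_properties(label: Int, isVisited: Boolean = false)

object IterationLoopSketch {
  def run(
      initial: Graph[nodes_properties, Int],
      iterations: Int,
      checkpointEvery: Int): Graph[nodes_properties, Int] = {
    var current = initial.cache()
    for (i <- 1 to iterations) {
      // Placeholder update rule; the real conditions would go here.
      val next = current.mapVertices { (_, v) =>
        if (v.label == i) v.copy(isVisited = true) else v
      }.cache()

      // checkpoint() has to be requested before the first action on the graph.
      // It needs a checkpoint directory and cuts the growing lineage.
      if (i % checkpointEvery == 0) next.checkpoint()

      // Force evaluation so the new vertices are cached before the old ones go.
      next.vertices.count()

      // Drop only the old vertices; mapVertices reuses the edge RDD.
      current.unpersistVertices(blocking = false)
      current = next
    }
    current
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("iteration-loop-sketch")
    val sc = new SparkContext(conf)
    sc.setCheckpointDir(Files.createTempDirectory("graphx-ckpt").toString)

    // Hypothetical toy edge list standing in for the GraphLoader input.
    val edges = sc.parallelize(Seq((1L, 2L), (2L, 3L), (3L, 1L)))
    val start = Graph.fromEdgeTuples(edges, defaultValue = 0)
      .mapVertices { case (id, _) => nodes_properties(id.toInt) }

    val result = run(start, iterations = 3, checkpointEvery = 2)
    result.vertices.collect().sortBy(_._1).foreach(println)
    sc.stop()
  }
}
```

Without an action such as count() in each iteration, the transformations only accumulate lazily, so the cost of every earlier update can end up being paid again when the graph is finally evaluated.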