I use Cython to wrap C++ code and expose it to Python for interactive work. My problem is that I need to read large graphs (several gigabytes) from file and they end up twice in the memory. Can anyone help me diagnose and solve this problem?
My Cython wrapper for the graph class looks like this:
cdef extern from "../src/graph/Graph.h":
cdef cppclass _Graph "Graph":
_Graph() except +
_Graph(count) except +
count numberOfNodes() except +
count numberOfEdges() except +
cdef class Graph:
"""An undirected, optionally weighted graph"""
cdef _Graph _this
def __cinit__(self, n=None):
if n is not None:
self._this = _Graph(n)
# any _thisect which appears as a return type needs to implement setThis
cdef setThis(self, _Graph other):
#del self._this
self._this = other
return self
def numberOfNodes(self):
return self._this.numberOfNodes()
def numberOfEdges(self):
return self._this.numberOfEdges()
If a Python Graph needs to be returned, it needs to be created empty and then the setThis
method is used to set the native _Graph
instance. This happens, for example, when a Graph
is read from file. This is the job of this class:
cdef extern from "../src/io/METISGraphReader.h":
cdef cppclass _METISGraphReader "METISGraphReader":
_METISGraphReader() except +
_Graph read(string path) except +
cdef class METISGraphReader:
""" Reads the METIS adjacency file format [1]
[1]: http://people.sc.fsu.edu/~jburkardt/data/metis_graph/metis_graph.html
"""
cdef _METISGraphReader _this
def read(self, path):
pathbytes = path.encode("utf-8") # string needs to be converted to bytes, which are coerced to std::string
return Graph(0).setThis(self._this.read(pathbytes))
Interactive usage looks like this:
>>> G = graphio.METISGraphReader().read("giant.metis.graph")
After the reading from file is done and X GB memory are used, there is a phase where obviously copying happens, and after that 2X GB memory are used. The entire memory is freed when del G
is called.
Where is my error which leads to the graph being copied and existing twice in memory?
You'll need to modify the C++ class to store it's data via a shared_ptr. Make sure you have a proper copy constructor and assignment operator:
If your constraint/goal is “Compute on graphs with billions of edges, in a reasonable time, on a single PC.”, consider refactoring to leverage GraphChi.
If single-machine/in-memory is not a constraint, consider leveraging a graph database like Neo4j instead of pulling all data into memory. There are also graph APIs that overlay with Hadoop (e.g. Apache Giraph).
I don't have a definitive answer for you, but I have a theory.
The Cython wrappers that you wrote are unusual, in that they wrap the C++ object directly instead of a pointer to it.
The following code is particularly inefficient:
The reason is that your
_Graph
class contains several STL vectors, and those will have to be copied over. So, when yourother
object is assigned toself._this
the memory usage is effectively doubled (or worse, since the STL allocators can overallocate for performance reasons).I wrote a simple test that matches yours and added logging everywhere to see how objects are created, copied or destroyed. I can't find any issues there. The copies do happen, but after the assignment is complete I see that only one object remains.
So my theory is that the extra memory that you see is related to STL allocator logic in the vectors. All that extra memory must be attached to the final object after the copies.
My recommendation is that you switch to the more standard pointer based wrapping. Your
_Graph
wrapper then should be defined more or less as follows:Note that I need to delete
_this
because it is a pointer.You will then need to modify your
METISGraphReader::read()
method to return a heap allocatedGraph
. The prototype of this method should be changed to:Then the Cython wrapper for it can be written as:
If you do it this way there is only one object, the one that is created on the heap by
read()
. A pointer to that object is returned to theread()
Cython wrapper, which then installs it in a brand newGraph()
instance. The only thing that gets copied is the 4 or 8 bytes of the pointer.I hope this helps!