Cython: How to move large objects without copying

2020-07-10 08:49发布

I use Cython to wrap C++ code and expose it to Python for interactive work. My problem is that I need to read large graphs (several gigabytes) from file and they end up twice in the memory. Can anyone help me diagnose and solve this problem?

My Cython wrapper for the graph class looks like this:

cdef extern from "../src/graph/Graph.h":
    cdef cppclass _Graph "Graph":
        _Graph() except +
        _Graph(count) except +
        count numberOfNodes() except +
        count numberOfEdges() except +


cdef class Graph:
    """An undirected, optionally weighted graph"""
    cdef _Graph _this

    def __cinit__(self, n=None):
        if n is not None:
            self._this = _Graph(n)

    # any _thisect which appears as a return type needs to implement setThis
    cdef setThis(self, _Graph other):
        #del self._this
        self._this = other
        return self

    def numberOfNodes(self):
        return self._this.numberOfNodes()

    def numberOfEdges(self):
        return self._this.numberOfEdges()

If a Python Graph needs to be returned, it needs to be created empty and then the setThis method is used to set the native _Graph instance. This happens, for example, when a Graph is read from file. This is the job of this class:

cdef extern from "../src/io/METISGraphReader.h":
    cdef cppclass _METISGraphReader "METISGraphReader":
        _METISGraphReader() except +
        _Graph read(string path) except +

cdef class METISGraphReader:
    """ Reads the METIS adjacency file format [1]
        [1]: http://people.sc.fsu.edu/~jburkardt/data/metis_graph/metis_graph.html
    """
    cdef _METISGraphReader _this

    def read(self, path):
        pathbytes = path.encode("utf-8") # string needs to be converted to bytes, which are coerced to std::string
        return Graph(0).setThis(self._this.read(pathbytes))

Interactive usage looks like this:

 >>> G = graphio.METISGraphReader().read("giant.metis.graph")

After the reading from file is done and X GB memory are used, there is a phase where obviously copying happens, and after that 2X GB memory are used. The entire memory is freed when del G is called.

Where is my error which leads to the graph being copied and existing twice in memory?

3条回答
别忘想泡老子
2楼-- · 2020-07-10 09:14

You'll need to modify the C++ class to store it's data via a shared_ptr. Make sure you have a proper copy constructor and assignment operator:

#define  _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <memory>

struct Data { // your graph data
    Data(const char* _d = NULL) {
        if (_d)
            strncpy(d, _d, sizeof(d)-1);
        else
            memset(d, 0, sizeof(d));
    }
    Data(const Data& rhs) {
        memcpy(d, rhs.d, sizeof(d));
    }
    ~Data() {
        memset(d, 0, sizeof(d));
    }
    void DoSomething() { /* do something */ } // a public method that was used in Python

    char d[1024];
};

class A { // the wrapper class
public:
    A() {}
    A(const char* name)  : pData(new Data(name)) {}
    A(const A& rhs) : pData(rhs.pData) {}
    A& operator=(const A& rhs) {
        pData = rhs.pData;
        return *this;
    }
    ~A() {}
    // interface with Data
    void DoSomething() {
        if (pData.get() != NULL)
            pData->DoSomething();
    }

private:
    std::shared_ptr<Data> pData;
};

int main(int argc, char** argv)
{
    A o1("Hello!");
    A o2(o1);
    A o3;
    o3 = o2;
    return 0;
}
查看更多
别忘想泡老子
3楼-- · 2020-07-10 09:25

If your constraint/goal is “Compute on graphs with billions of edges, in a reasonable time, on a single PC.”, consider refactoring to leverage GraphChi.

If single-machine/in-memory is not a constraint, consider leveraging a graph database like Neo4j instead of pulling all data into memory. There are also graph APIs that overlay with Hadoop (e.g. Apache Giraph).

查看更多
够拽才男人
4楼-- · 2020-07-10 09:32

I don't have a definitive answer for you, but I have a theory.

The Cython wrappers that you wrote are unusual, in that they wrap the C++ object directly instead of a pointer to it.

The following code is particularly inefficient:

cdef setThis(self, _Graph other):
    self._this = other
    return self 

The reason is that your _Graph class contains several STL vectors, and those will have to be copied over. So, when your other object is assigned to self._this the memory usage is effectively doubled (or worse, since the STL allocators can overallocate for performance reasons).

I wrote a simple test that matches yours and added logging everywhere to see how objects are created, copied or destroyed. I can't find any issues there. The copies do happen, but after the assignment is complete I see that only one object remains.

So my theory is that the extra memory that you see is related to STL allocator logic in the vectors. All that extra memory must be attached to the final object after the copies.

My recommendation is that you switch to the more standard pointer based wrapping. Your _Graph wrapper then should be defined more or less as follows:

cdef class Graph:
    """An undirected, optionally weighted graph"""
    cdef _Graph* _this

    def __cinit__(self, n=None):
        if n is not None:
            self._this = new _Graph(n)
        else:
            self._this = 0

    cdef setThis(self, _Graph* other):
        del self._this
        self._this = other
        return self

    def __dealloc__(self):
        del self._this

Note that I need to delete _this because it is a pointer.

You will then need to modify your METISGraphReader::read() method to return a heap allocated Graph. The prototype of this method should be changed to:

Graph* METISGraphReader::read(std::string path);

Then the Cython wrapper for it can be written as:

    def read(self, path):
        pathbytes = path.encode("utf-8") # string needs to be converted to bytes, which are coerced to std::string
        return Graph().setThis(self._this.read(pathbytes))

If you do it this way there is only one object, the one that is created on the heap by read(). A pointer to that object is returned to the read() Cython wrapper, which then installs it in a brand new Graph() instance. The only thing that gets copied is the 4 or 8 bytes of the pointer.

I hope this helps!

查看更多
登录 后发表回答