Python C Wrapper Memory Leak

Posted 2019-09-17 20:03

I am moderately experienced in Python and C but new to writing Python modules that wrap C functions. For a project I needed one function named "score" to run much faster than I could manage in Python, so I coded it in C and just want to be able to call it from Python. It takes a Python list of integers. I want the C function to receive an array of integers and the length of that array, and then return an integer back to Python. Here is my current (working) solution.

static PyObject *module_score(PyObject *self, PyObject *args) {
    int i, size, value, *gene;
    PyObject *seq, *data;

    /* Parse the input tuple */
    if (!PyArg_ParseTuple(args, "O", &data))
        return NULL;
    seq = PySequence_Fast(data, "expected a sequence");
    size = PySequence_Size(seq);

    gene = (int*) PyMem_Malloc(size * sizeof(int));
    for (i = 0; i < size; i++)
        gene[i] = PyInt_AsLong(PySequence_Fast_GET_ITEM(seq, i));

    /* Call the external C function*/
    value = score(gene, size);

    PyMem_Free(gene);

    /* Build the output tuple */
    PyObject *ret = Py_BuildValue("i", value);
    return ret;
}

This works but seems to leak memory and at a rate I can't ignore. I made sure that the leak is happening in the shown function by temporarily making the score function just return 0 and still saw the leaking behavior. I had thought that the call to PyMem_Free should take care of the PyMem_Malloc'ed storage but my current guess is that something in this function is getting allocated and retained on each call since the leaking behavior is proportional to the number of calls to this function. Am I not doing the sequence to array conversion correctly or am I possibly returning the ending value inefficiently? Any help is appreciated.

1 Answer
疯言疯语
#2 · 2019-09-17 20:13

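PySequence_Fast returns a new reference, so seq is a new Python object that you own. You need to release it with Py_DECREF when you are done with it, otherwise one object is leaked on every call. You should also check whether seq is NULL.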

Something like (untested):

static PyObject *module_score(PyObject *self, PyObject *args) {
    int i, size, value, *gene;
    long temp;
    PyObject *seq, *data;

    /* Parse the input tuple */
    if (!PyArg_ParseTuple(args, "O", &data))
        return NULL;

    /* PySequence_Fast returns a new reference, or NULL on failure */
    if (!(seq = PySequence_Fast(data, "expected a sequence")))
        return NULL;

    size = PySequence_Size(seq);

    gene = (int*) PyMem_Malloc(size * sizeof(int));
    if (gene == NULL) {
        Py_DECREF(seq);
        return PyErr_NoMemory();
    }
    for (i = 0; i < size; i++) {
        temp = PyInt_AsLong(PySequence_Fast_GET_ITEM(seq, i));
        if (temp == -1 && PyErr_Occurred()) {
            /* Release everything acquired so far before bailing out */
            PyMem_Free(gene);
            Py_DECREF(seq);
            PyErr_SetString(PyExc_ValueError, "an integer value is required");
            return NULL;
        }
        /* Do whatever you need to verify temp will fit in an int */
        gene[i] = (int)temp;
    }

    /* Call the external C function */
    value = score(gene, size);

    /* Free the C array and drop our reference to the sequence */
    PyMem_Free(gene);
    Py_DECREF(seq);

    /* Build the return value (a Python int) */
    PyObject *ret = Py_BuildValue("i", value);
    return ret;
}
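
For the "verify temp will fit in an int" step, one possible approach is a plain range check before the cast. This is only a minimal sketch in standard C; long_to_int_checked is a made-up helper name for illustration, not something provided by the Python C API:

```c
#include <limits.h>

/* Hypothetical helper (not part of the Python C API).
   Stores v in *out and returns 1 when v is representable as an int.
   Returns 0 and leaves *out untouched when v is out of range. */
static int long_to_int_checked(long v, int *out)
{
    if (v < INT_MIN || v > INT_MAX)
        return 0;
    *out = (int)v;
    return 1;
}
```

In the loop you would call it right after the PyInt_AsLong error check. If it returns 0, clean up the same way as the other error path (PyMem_Free(gene), Py_DECREF(seq)) and raise something like OverflowError before returning NULL.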