I have traced a memory leak in my program to a Python module I wrote in C to efficiently parse an array expressed in ASCII hex (e.g. "FF 39 00 FC ...").
char* buf;
unsigned short bytesPerTable;
if (!PyArg_ParseTuple(args, "sH", &buf, &bytesPerTable))
{
    return NULL;
}
unsigned short rowSize = bytesPerTable;
char* CArray = malloc(rowSize * sizeof(char));
// Populate CArray with data parsed from buf
ascii_buf_to_table(buf, bytesPerTable, rowSize, CArray);
int dims[1] = {rowSize};
PyObject* pythonArray = PyArray_SimpleNewFromData(1, (npy_intp*)dims, NPY_INT8, (void*)CArray);
return Py_BuildValue("(O)", pythonArray);
I realized that NumPy does not know to free the memory allocated for CArray, which causes a memory leak. After some research, and at the suggestion of comments in this article, I added the following line, which is supposed to tell the array that it "owns" its data and should free it when the array is deleted.
PyArray_ENABLEFLAGS((PyArrayObject*)pythonArray, NPY_ARRAY_OWNDATA);
But I am still getting the memory leak. What am I doing wrong? How do I get the NPY_ARRAY_OWNDATA flag to work properly?
For reference, the documentation in ndarraytypes.h makes it seem like this should work:
/*
* If set, the array owns the data: it will be free'd when the array
* is deleted.
*
* This flag may be tested for in PyArray_FLAGS(arr).
*/
#define NPY_ARRAY_OWNDATA 0x0004
Also for reference, the following code (calling the Python function defined in C) demonstrates the memory leak.
tableData = "FF 39 00 FC FD 37 FF FF F9 38 FE FF F1 39 FE FC \n" \
"EF 38 FF FE 47 40 00 FB 3D 3B 00 FE 41 3D 00 FE \n" \
"43 3E 00 FF 42 3C FE 02 3C 40 FD 02 31 40 FE FF \n" \
"2E 3E FF FE 24 3D FF FE 15 3E 00 FC 0D 3C 01 FA \n" \
"02 3E 01 FE 01 3E 00 FF F7 3F FF FB F4 3F FF FB \n" \
"F1 3D FE 00 F4 3D FE 00 F9 3E FE FC FE 3E FD FE \n" \
"F6 3E FE 02 03 3E 00 FE 04 3E 00 FC 0B 3D 00 FD \n" \
"09 3A 00 01 03 3D 00 FD FB 3B FE FB FD 3E FD FF \n"
for i in xrange(1000000):
    PES = ParseTable(tableData, 128, 4)  # Causes memory usage to skyrocket
It's probably a reference-count issue (from How to extend NumPy):