While using an open-source Cython library I found a memory leak. The leak seems to come from a typed NumPy array, which is not freed from memory when it goes out of scope. The declaration is the following:
cdef np.ndarray[object, ndim=1] my_array = np.empty(my_size, dtype=object)
As I understand it, the garbage collector should treat this like any other NumPy array and free its memory as soon as the array goes out of scope, in this case at the end of the function in which it is declared. Apparently this does not happen.
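To check that expectation in isolation, here is a minimal sketch in plain Python rather than Cython, assuming CPython and an installed NumPy. The `Node` class and the `fill_and_drop` function are hypothetical stand-ins, not part of the library in question. It stores one object in a local object-dtype array and uses a weak reference to see whether that object is released once the array goes out of scope.

```python
import gc
import weakref

import numpy as np


class Node:
    """Hypothetical stand-in for whatever the object array stores."""


def fill_and_drop(size):
    """Create a local object array, store one Node, return a weak reference to it."""
    arr = np.empty(size, dtype=object)
    node = Node()
    arr[0] = node
    # Both `arr` and `node` go out of scope when the function returns.
    return weakref.ref(node)


ref = fill_and_drop(10)
gc.collect()

# True if the array released its reference to the Node when it was destroyed.
node_was_freed = ref() is None
print(node_was_freed)
```

If `node_was_freed` is true, the array itself is releasing its elements as expected, which would suggest looking for the leak in whatever the elements keep alive rather than in the array declaration.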
If the array were created as a Cython array first and then cast to a NumPy array, one could use the callback_free_data function as described here and here. However, in this case it is not possible to reach the pointers of my_array, so the callback cannot be set.
Any idea why this kind of declaration could cause a memory leak, and/or how to force the deallocation?
Update:
My question was very generic, and I wanted to avoid posting the code because it is a bit intricate, but since someone asked, here it is:
cdef dijkstra(Graph G, int start_idx, int end_idx):
    # Some code
    cdef np.ndarray[object, ndim=1] fiboheap_nodes = np.empty([G.num_nodes], dtype=object)  # holds all of our FiboHeap Nodes Pointers
    Q = FiboHeap()
    fiboheap_nodes[start_idx] = Q.insert(0, start_idx)
    # Some other code where it could perform operations like:
    # Q.decrease_key(fiboheap_nodes[w], vw_distance)
    # End of operations
    # do we need to cleanup the fiboheap_nodes array here?
    return
The FiboHeap is a Cython wrapper for the C implementation. For example, the insert function looks like this:
cimport cfiboheap
from cpython.pycapsule cimport PyCapsule_New, PyCapsule_GetPointer
from python_ref cimport Py_INCREF, Py_DECREF

cdef inline object convert_fibheap_el_to_pycapsule(cfiboheap.fibheap_el* element):
    return PyCapsule_New(element, NULL, NULL)

cdef class FiboHeap:

    def __cinit__(FiboHeap self):
        self.treeptr = cfiboheap.fh_makekeyheap()
        if self.treeptr is NULL:
            raise MemoryError()

    def __dealloc__(FiboHeap self):
        if self.treeptr is not NULL:
            cfiboheap.fh_deleteheap(self.treeptr)

    cpdef object insert(FiboHeap self, double key, object data=None):
        Py_INCREF(data)
        cdef cfiboheap.fibheap_el* retValue = cfiboheap.fh_insertkey(self.treeptr, key, <void*>data)
        if retValue is NULL:
            raise MemoryError()
        return convert_fibheap_el_to_pycapsule(retValue)
The __dealloc__() function works as it is supposed to, so the FiboHeap is released from memory at the end of the function dijkstra(...). My guess is that something is going wrong with the pointers contained in fiboheap_nodes.
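One way to probe that guess is to look at reference counts rather than the array. The sketch below is plain Python on CPython, not the library's code. It uses `ctypes.pythonapi` to mimic the `Py_INCREF(data)` call in `insert()`, for which the code shown above has no matching `Py_DECREF`. The `Payload` class is a hypothetical stand-in for the `data` argument.

```python
import ctypes
import gc
import weakref


class Payload:
    """Hypothetical object playing the role of `data` in insert()."""


# Declare the C-API signatures: both take a PyObject* and return nothing.
ctypes.pythonapi.Py_IncRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_IncRef.restype = None
ctypes.pythonapi.Py_DecRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_DecRef.restype = None

payload = Payload()
ref = weakref.ref(payload)

# Take an extra reference by hand, as Py_INCREF(data) does.
ctypes.pythonapi.Py_IncRef(payload)

# Drop the only named reference; the manual one is still outstanding.
del payload
gc.collect()
alive_after_del = ref() is not None

# Give the manual reference back, as a matching Py_DECREF would.
ctypes.pythonapi.Py_DecRef(ref())
gc.collect()
freed_after_decref = ref() is None

print(alive_after_del, freed_after_decref)
```

If the same pattern applies here, an object passed as `data` would stay alive until something decrements its count again, regardless of when fiboheap_nodes or the FiboHeap itself is destroyed.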
Any guess?