How to fix memory leak in scipy.stats when generat

Posted 2019-06-05 21:46

Question:

The following minimal example appears to suffer from a memory leak (tested using SciPy version 0.17.0):

import resource
from scipy.stats import rv_continuous

class Rv(rv_continuous):

    def __init__(self, x):
        rv_continuous.__init__(self, a=0, b=1)
        self.x = x

    def _pdf(self, y):
        return 1


def call_rv(x):
    rv = Rv(x)
    # if the line below is commented out, memory usage stays constant
    s = rv.rvs()

    return 1

for k in range(10000):
    x = call_rv(k)
    if k % 1000 == 0:
        # ru_maxrss is the peak resident set size (kilobytes on Linux)
        mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print('Memory usage: %s (kb)' % mem)

I don't understand what causes the leak in my example. Notably, when the random variate generation s = rv.rvs() is commented out, the leak doesn't occur.

How can the memory leak be avoided when using rv_continuous and random variate generation?

Answer 1:

This is not a memory leak; the memory will eventually be returned to the OS.

rv = Rv(x)

creates a new instance on every iteration of the loop. Avoid that, and memory consumption stays in check. To generate N variates, create the instance once and then call .rvs(size=N).
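As a minimal sketch of that suggestion, the snippet below reuses the Rv class from the question, builds a single instance, and draws all variates in one call. The value N = 100 and the constructor argument 0 are illustrative assumptions, not values from the original post.

```python
from scipy.stats import rv_continuous


class Rv(rv_continuous):
    # Same class as in the question: constant density on [0, 1]

    def __init__(self, x):
        rv_continuous.__init__(self, a=0, b=1)
        self.x = x

    def _pdf(self, y):
        return 1


N = 100   # illustrative sample count (assumption)
rv = Rv(0)  # create the distribution object once, outside any loop

# One call returns an array of N variates instead of N separate instances
samples = rv.rvs(size=N)
print(samples.shape)
```

Because only one Rv object is ever constructed, the per-instance setup cost in rv_continuous is paid once rather than on each iteration.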