Defining a custom PyMC distribution

Posted 2019-03-26 07:26

Question:

This is perhaps a silly question.

I'm trying to fit data to a very strange PDF using MCMC evaluation in PyMC. For this example I just want to figure out how to fit to a normal distribution where I manually input the normal PDF. My code is:

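import random
import numpy as np
import pymc as mc

# Draw 1000 samples from a normal distribution (mean -200, std dev 15)
data = [random.gauss(-200, 15) for _ in range(1000)]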

mean = mc.Uniform('mean', lower=min(data), upper=max(data))
std_dev = mc.Uniform('std_dev', lower=0, upper=50)

# @mc.potential
# def density(x = data, mu = mean, sigma = std_dev):
#   return (1./(sigma*np.sqrt(2*np.pi))*np.exp(-((x-mu)**2/(2*sigma**2))))

mc.Normal('process', mu=mean, tau=1./std_dev**2, value=data, observed=True)

model = mc.MCMC([mean,std_dev])
model.sample(iter=5000)

print("!")
print(model.stats()['mean']['mean'])
print(model.stats()['std_dev']['mean'])

The examples I've found all use a built-in distribution such as mc.Normal or mc.Poisson, but I want to fit to the commented-out density function.

Any help would be appreciated.

Answer 1:

An easy way is to use the stochastic decorator:

import pymc as mc
import numpy as np

data = np.random.normal(-200,15,size=1000)

mean = mc.Uniform('mean', lower=min(data), upper=max(data))
std_dev = mc.Uniform('std_dev', lower=0, upper=50)

@mc.stochastic(observed=True)
def custom_stochastic(value=data, mean=mean, std_dev=std_dev):
    return np.sum(-np.log(std_dev) - 0.5*np.log(2) - 
                  0.5*np.log(np.pi) - 
                  (value-mean)**2 / (2*(std_dev**2)))


model = mc.MCMC([mean,std_dev,custom_stochastic])
model.sample(iter=5000)

print("!")
print(model.stats()['mean']['mean'])
print(model.stats()['std_dev']['mean'])

Note that my custom_stochastic function returns the log likelihood, not the likelihood, and that it is the log likelihood for the entire sample.
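
To make the likelihood vs. log likelihood point concrete, here is a small standalone check that needs only the standard library, not PyMC. It compares the sum of the logs of the normal density from the question with the term-by-term expression used in custom_stochastic. The helper names density and log_likelihood, and the four sample values, are made up for illustration.

```python
import math

def density(x, mu, sigma):
    """Normal PDF at a single point, same formula as the question's commented-out function."""
    return 1.0 / (sigma * math.sqrt(2 * math.pi)) * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def log_likelihood(values, mu, sigma):
    """Sum of per-point log densities, expanded term by term like custom_stochastic."""
    return sum(
        -math.log(sigma) - 0.5 * math.log(2) - 0.5 * math.log(math.pi)
        - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in values
    )

# Hypothetical observations, chosen only for this check
sample = [-210.0, -195.5, -188.0, -204.2]

# Taking the log of each density and summing ...
direct = sum(math.log(density(x, -200.0, 15.0)) for x in sample)
# ... should agree with the expanded log-likelihood expression
expanded = log_likelihood(sample, -200.0, 15.0)

print(direct, expanded)
```

The two numbers should agree up to floating-point error. Working in log space also avoids the underflow you would likely hit by multiplying 1000 small density values together.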

There are a few other ways to create custom stochastic nodes. This doc gives more details, and this gist contains an example using pymc.Stochastic to create a node with a kernel density estimator.