-->

Bayesian Correlation with PyMC3

2019-05-17 17:37发布

问题:

I'm trying to convert this example of Bayesian correlation for PyMC2 to PyMC3, but get completely different results. Most importantly, the mean of the multivariate Normal distribution quickly goes to zero, whereas it should be around 400 (as it is for PyMC2). Consequently, the estimated correlation quickly goes towards 1, which is wrong as well.

The full code is available in this notebook for PyMC2 and in this notebook for PyMC3.

The relevant code for PyMC2 is

def analyze(data):
    # priors might be adapted here to be less flat
    mu = pymc.Normal('mu', 0, 0.000001, size=2)
    sigma = pymc.Uniform('sigma', 0, 1000, size=2)
    rho = pymc.Uniform('r', -1, 1)

    @pymc.deterministic
    def precision(sigma=sigma,rho=rho):
        ss1 = float(sigma[0] * sigma[0])
        ss2 = float(sigma[1] * sigma[1])
        rss = float(rho * sigma[0] * sigma[1])
        return np.linalg.inv(np.mat([[ss1, rss], [rss, ss2]]))

    mult_n = pymc.MvNormal('mult_n', mu=mu, tau=precision, value=data.T, observed=True)

    model = pymc.MCMC(locals()) 
    model.sample(50000,25000) 

My port of the above code to PyMC3 is as follows:

def precision(sigma, rho):
    C = T.alloc(rho, 2, 2)
    C = T.fill_diagonal(C, 1.)
    S = T.diag(sigma)
    return T.nlinalg.matrix_inverse(T.nlinalg.matrix_dot(S, C, S))


def analyze(data):
    with pm.Model() as model:
        # priors might be adapted here to be less flat
        mu = pm.Normal('mu', mu=0., sd=0.000001, shape=2, testval=np.mean(data, axis=1))
        sigma = pm.Uniform('sigma', lower=1e-6, upper=1000., shape=2, testval=np.std(data, axis=1))
        rho = pm.Uniform('r', lower=-1., upper=1., testval=0)

        prec = pm.Deterministic('prec', precision(sigma, rho))
        mult_n = pm.MvNormal('mult_n', mu=mu, tau=prec, observed=data.T)

    return model

model = analyze(data)
with model:
    trace = pm.sample(50000, tune=25000, step=pm.Metropolis())

The PyMC3 version runs, but clearly does not return the expected result. Any help would be highly appreciated.

回答1:

The call signature of pymc.Normal is

In [125]: pymc.Normal?
Init signature: pymc.Normal(self, *args, **kwds)
Docstring:
N = Normal(name, mu, tau, value=None, observed=False, size=1, trace=True, rseed=True, doc=None, verbose=-1, debug=False)

Notice that the third positional argument of pymc.Normal is tau, not the standard deviation, sd.

Therefore, since the pymc code uses

mu = Normal('mu', 0, 0.000001, size=2)

The corresponding pymc3 code should use

mu = pm.Normal('mu', mu=0., tau=0.000001, shape=2, ...)

or

mu = pm.Normal('mu', mu=0., sd=math.sqrt(1/0.000001), shape=2, ...)

since tau = 1/sigma**2.


With this one change, your pymc3 code produces (something like)