In order to calculate the CDF of a multivariate normal, I followed this example (for the univariate case) but cannot interpret the output produced by scipy:
from scipy.stats import norm
import numpy as np
mean = np.array([1,5])
covariance = np.matrix([[1, 0.3 ],[0.3, 1]])
distribution = norm(loc=mean,scale = covariance)
print distribution.cdf(np.array([2,4]))
The output produced is:
[[ 8.41344746e-01 4.29060333e-04]
[ 9.99570940e-01 1.58655254e-01]]
If the joint CDF is defined as:
P (X1 ≤ x1, . . . ,Xn ≤ xn)
then the expected output should be a real number between 0 and 1.
After searching a lot, I think this blog entry by Noah H. Silbert describes the only readymade code from a standard library that can be used for computing the cdf for a multivariate normal in Python. Scipy has a way to do it but as mentioned in the blog, it is difficult to find. The approach is based on a paper by Alan Genz’s.
From the blog, this is how it works.
If you don't care about performance (i.e. perform it only occasionally), then you can create the multivariate normal pdf using
multivariate_normal
, and then calculate the cdf byintegrate.nquad
The scipy
multivariate_normal
from v1.1.0 has a cdf function built in now:Documentation for v1.4.1 can be found here.