I am having trouble reconciling some basic theoretical results on Gaussian mixtures with the output of the commands gmdistribution and random in Matlab.
Consider a mixture of two independent 3-variate normal distributions with weights 1/2, 1/2.
The first distribution, A, is characterised by mean and variance-covariance matrix equal to
muA=[-1.4 3.2 -1.9]; %mean vector
rhoA=-0.5; %correlation among components in A
sigmaA=[1 rhoA rhoA; rhoA 1 rhoA; rhoA rhoA 1]; %variance-covariance matrix of A
The second distribution, B, is characterised by mean and variance-covariance matrix equal to
muB=[1.2 -1.6 1.5]; %mean vector
rhoB=0.3; %correlation among components in B
sigmaB=[1 rhoB rhoB; rhoB 1 rhoB; rhoB rhoB 1]; %variance-covariance matrix of B
Let epsilon be the 3-variate random vector distributed as the mixture. My calculations suggest that the expected value of epsilon should be
Mtheory=1/2*(muA+muB);
and the variance-covariance matrix should be
Vtheory=1/4*[2 rhoA+rhoB rhoA+rhoB; rhoA+rhoB 2 rhoA+rhoB; rhoA+rhoB rhoA+rhoB 2];
Let's now check whether Mtheory and Vtheory coincide with the empirical moments that we get by drawing many random numbers from the mixture.
clear
rng default
n=10^6; %number of draws
w = ones(1,2)/2; %weights
rhoA=-0.5; %correlation among components of A
rhoB=0.3; %correlation among components of B
muA=[-1.4 3.2 -1.9]; %mean vector of A
muB=[1.2 -1.6 1.5]; %mean vector of B
mu = [muA;muB];
%Variance-covariance matrix for mixing
sigmaA=[1 rhoA rhoA; rhoA 1 rhoA; rhoA rhoA 1]; %variance-covariance matrix of A
sigmaB=[1 rhoB rhoB; rhoB 1 rhoB; rhoB rhoB 1]; %variance-covariance matrix of B
sigma = cat(3,sigmaA,sigmaB);
obj = gmdistribution(mu, sigma,w);
%Draws
epsilon = random(obj, n);
M=mean(epsilon);
V=cov(epsilon);
Mtheory=1/2*(muA+muB);
Vtheory=1/4*[2 rhoA+rhoB rhoA+rhoB; rhoA+rhoB 2 rhoA+rhoB; rhoA+rhoB rhoA+rhoB 2];
Question: M and Mtheory almost coincide. V and Vtheory are completely different. What am I doing wrong? I must be doing something very silly, but I don't see where.
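When you calculate the covariance, pay attention to the fact that your data isn't centered: muA and muB both differ from the mean of the mixture, and the spread between the component means contributes to the covariance.
Moreover, your 0.25 factor is wrong. A draw from the mixture is not a scaling of the variables, as in (A+B)/2, but a selection: each draw is either A or B, each with probability 1/2.
The calculation should be done using the Law of Total Variance / Law of Total Covariance, where the "given event" is the mixture index.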
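As a sketch of how the law of total covariance applies here, let $I$ denote the mixture index (a label introduced only for this illustration), $w_k$ the weights, and treat the means as column vectors with $\bar\mu = \sum_k w_k \mu_k$:

$$
\operatorname{Cov}(\epsilon)
= \operatorname{E}\big[\operatorname{Cov}(\epsilon \mid I)\big] + \operatorname{Cov}\big(\operatorname{E}[\epsilon \mid I]\big)
= \sum_k w_k \Sigma_k + \sum_k w_k (\mu_k - \bar\mu)(\mu_k - \bar\mu)^\top
$$

For two components with weights 1/2 each, $\mu_A - \bar\mu = (\mu_A - \mu_B)/2$ and $\mu_B - \bar\mu = -(\mu_A - \mu_B)/2$, so this reduces to

$$
\operatorname{Cov}(\epsilon) = \tfrac{1}{2}\,(\Sigma_A + \Sigma_B) + \tfrac{1}{4}\,(\mu_A - \mu_B)(\mu_A - \mu_B)^\top
$$

The first term is the average within-component covariance; the second is the between-component term that is missing when the means are ignored.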
An example of the calculation is given by Calculation of the Covariance of Gaussian Mixtures.
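For the numbers in the question, a minimal Matlab sketch of that calculation could look like the following. It reuses the question's variable names and only needs base Matlab; the loop form is one way to write it, assuming the covariance of a mixture is the weighted sum of each component's covariance plus the outer product of its mean offset from the mixture mean.

```matlab
% Theoretical moments of a Gaussian mixture via the law of total covariance.
% Sketch only: values copied from the question, loop written for K components.
w    = [0.5 0.5];                  % mixture weights
muA  = [-1.4 3.2 -1.9];            % mean vector of A
muB  = [1.2 -1.6 1.5];             % mean vector of B
rhoA = -0.5;                       % correlation among components of A
rhoB = 0.3;                        % correlation among components of B
sigmaA = [1 rhoA rhoA; rhoA 1 rhoA; rhoA rhoA 1];
sigmaB = [1 rhoB rhoB; rhoB 1 rhoB; rhoB rhoB 1];

mu    = [muA; muB];                % K-by-d matrix of component means
sigma = cat(3, sigmaA, sigmaB);    % d-by-d-by-K array of covariances

Mtheory = w * mu;                  % mixture mean, 1-by-d
Vtheory = zeros(size(sigma, 1));   % d-by-d accumulator
for k = 1:numel(w)
    d = mu(k, :) - Mtheory;        % offset of component mean from mixture mean
    % within-component covariance plus between-component outer product
    Vtheory = Vtheory + w(k) * (sigma(:, :, k) + d' * d);
end

disp(Mtheory)
disp(Vtheory)
```

With Vtheory computed this way, comparing it to V=cov(epsilon) from the question's script should give agreement up to sampling noise.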