Log likelihood function for GDA(Gaussian Discrimin

Posted 2019-09-11 04:47

Question:

I am having trouble understanding the likelihood function for GDA given in Andrew Ng's CS229 notes.

l(φ,µ0,µ1,Σ) = log (product from i=1 to m) {p(x(i)|y(i);µ0,µ1,Σ) p(y(i);φ)}

The link is http://cs229.stanford.edu/notes/cs229-notes2.pdf Page 5.

For linear regression, the likelihood was the product from i=1 to m of p(y(i)|x(i);θ), which made sense to me. Why does it change here to p(x(i)|y(i)) multiplied by p(y(i);φ)? Thanks in advance.

Answer 1:

The starting formula on page 5 is

l(φ,µ0,µ1,Σ) = log <product from i=1 to m> p(x_i, y_i;µ0,µ1,Σ,φ)

Leaving out the parameters φ,µ0,µ1,Σ for now, this can be written as

l = log <product> p(x_i, y_i)

Using the chain rule of probability, you can factor that as either

l = log <product> p(x_i|y_i)p(y_i)

or

l = log <product> p(y_i|x_i)p(x_i).

The page 5 formula uses the first factorization. There, φ is attached only to p(y_i), because only p(y) depends on it, while µ0, µ1 and Σ are attached only to p(x_i|y_i).
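Written out in full, the steps look roughly like this. It is only a restatement of the formulas already given, in LaTeX notation, with x^{(i)}, y^{(i)} standing for the x_i, y_i used in this answer; the last line simply turns the log of a product into a sum of logs:

```latex
\begin{align*}
\ell(\phi,\mu_0,\mu_1,\Sigma)
  &= \log \prod_{i=1}^{m} p(x^{(i)}, y^{(i)}; \phi,\mu_0,\mu_1,\Sigma) \\
  &= \log \prod_{i=1}^{m} p(x^{(i)} \mid y^{(i)}; \mu_0,\mu_1,\Sigma)\, p(y^{(i)}; \phi) \\
  &= \sum_{i=1}^{m} \log p(x^{(i)} \mid y^{(i)}; \mu_0,\mu_1,\Sigma)
   + \sum_{i=1}^{m} \log p(y^{(i)}; \phi)
\end{align*}
```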

The likelihood starts from the joint probability distribution p(x,y) instead of the conditional probability distribution p(y|x), which is why GDA is called a generative model: it models how x is distributed given y together with p(y), and p(y|x) can then be recovered with Bayes' rule. Logistic regression, by contrast, is a discriminative model that models p(y|x) directly. Both have their advantages and disadvantages. The notes seem to discuss that in a later section.
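To make this concrete, here is a minimal sketch in Python. It is not code from the notes: it assumes a one-dimensional x (so the covariance Σ becomes a single variance `var`), and the function names `gaussian_pdf`, `gda_log_likelihood`, `marginal_x` and `posterior_y1` are made up for illustration. It computes the joint log likelihood as a sum of log p(x|y) + log p(y), and shows that p(y|x) falls out of the same joint via Bayes' rule.

```python
import math

def gaussian_pdf(x, mu, var):
    # Density of a 1-D Gaussian; stands in for p(x | y) with a shared variance.
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def bernoulli_pmf(y, phi):
    # p(y; phi) for y in {0, 1}.
    return phi if y == 1 else 1.0 - phi

def gda_log_likelihood(xs, ys, phi, mu0, mu1, var):
    # Joint log likelihood: sum over i of log p(x_i | y_i) + log p(y_i).
    total = 0.0
    for x, y in zip(xs, ys):
        mu = mu1 if y == 1 else mu0
        total += math.log(gaussian_pdf(x, mu, var)) + math.log(bernoulli_pmf(y, phi))
    return total

def marginal_x(x, phi, mu0, mu1, var):
    # p(x) = sum over y of p(x | y) p(y).
    return gaussian_pdf(x, mu0, var) * (1.0 - phi) + gaussian_pdf(x, mu1, var) * phi

def posterior_y1(x, phi, mu0, mu1, var):
    # Bayes' rule: p(y=1 | x) = p(x | y=1) p(y=1) / p(x).
    return gaussian_pdf(x, mu1, var) * phi / marginal_x(x, phi, mu0, mu1, var)

# Toy data, chosen arbitrarily for illustration.
xs = [0.1, 1.9, 2.2, -0.3]
ys = [0, 1, 1, 0]
print(gda_log_likelihood(xs, ys, phi=0.5, mu0=0.0, mu1=2.0, var=1.0))
print(posterior_y1(1.2, phi=0.5, mu0=0.0, mu1=2.0, var=1.0))
```

Both factorizations give the same joint value for any point: `gaussian_pdf(x, mu1, var) * phi` equals `posterior_y1(x, ...) * marginal_x(x, ...)`.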