Matlab Principal Component Analysis (eigenvalues o

I want to use MATLAB's "princomp" function, but it returns the eigenvalues in a sorted array, so I can't tell which eigenvalue corresponds to which column. In MATLAB,

m = [1,2,3;4,5,6;7,8,9];
[pc,score,latent] = princomp(m);

is the same as

m = [2,1,3;5,4,6;8,7,9];
[pc,score,latent] = princomp(m);

That is, swapping the first two columns does not change anything. The result (eigenvalues) in latent will be (27,0,0) in both cases. The information (which eigenvalue corresponds to which original (input) column) is lost. Is there a way to tell MATLAB not to sort the eigenvalues?

Tags: matlab linear-algebra pca eigenvalue

2 answers

地球回转人心会变

Reply #2 · 2019-01-29 14:57

With PCA, each principal component returned is a linear combination of the original columns/dimensions. Perhaps an example will clear up any misunderstanding.

Let's consider the Fisher Iris dataset, comprising 150 instances and 4 dimensions, and apply PCA to the data. To make things easier to understand, I am first zero-centering the data before calling the PCA function:

load fisheriris
X = bsxfun(@minus, meas, mean(meas));    %# so that mean(X) is the zero vector

[PC score latent] = princomp(X);

Let's look at the first returned principal component (the 1st column of the PC matrix):

>> PC(:,1)
      0.36139
    -0.084523
      0.85667
      0.35829

This is expressed as a linear combination of the original dimensions, i.e.:

PC1 = 0.36139*dim1 - 0.084523*dim2 + 0.85667*dim3 + 0.35829*dim4

Therefore, to express the same data in the new coordinate system formed by the principal components, the new first dimension should be a linear combination of the original ones according to the above formula.

We can compute this simply as X*PC, which is exactly what is returned in the second output of PRINCOMP (score). To confirm this, try:

>> all(all( abs(X*PC - score) < 1e-10 ))
    1

Finally, the importance of each principal component can be determined by how much of the variance of the data it explains. This is returned by the third output of PRINCOMP (latent).
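To make "variance explained" concrete, here is a minimal sketch. It uses only base functions (cov and eig) rather than PRINCOMP so that it does not depend on the Statistics Toolbox, and the small data matrix is made up for illustration; it is not the Iris data. The names Xc, scoreVar and explained are my own.

```matlab
% Illustrative data only (not from the post): 5 observations, 3 variables.
X  = [2 0 1; 4 1 3; 6 1 2; 8 3 5; 10 2 4];
Xc = bsxfun(@minus, X, mean(X));          % zero-center each column

% Eigen-decomposition of the covariance matrix, sorted by decreasing variance
[V, E] = eig(cov(Xc));
[latent, order] = sort(diag(E), 'descend');
V = V(:, order);

% Project the data; the variance of each projected column equals its eigenvalue
score    = Xc * V;
scoreVar = var(score)';                   % column vector, one entry per component

% Percentage of the total variance carried by each component
explained = 100 * latent / sum(latent);

disp([latent scoreVar explained]);        % columns: eigenvalue, score variance, percent
```

The first two columns printed should agree up to rounding, and the third column should sum to 100.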

We can compute the PCA of the data ourselves without using PRINCOMP:

[V E] = eig( cov(X) );
[E order] = sort(diag(E), 'descend');
V = V(:,order);

The eigenvectors of the covariance matrix (the columns of V) are the principal components (the same as PC above, although the signs can be inverted), and the corresponding eigenvalues E represent the amount of variance explained (the same as latent). Note that it is customary to sort the principal components by their eigenvalues. As before, to express the data in the new coordinates, we simply compute X*V (which should be the same as score above, once you make sure to match the signs).
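If the underlying goal is to relate components back to input columns, one option is to inspect the coefficients (loadings) in each eigenvector. The sketch below reports, for each component, the input column with the largest absolute coefficient. The data matrix is invented for illustration, and the names maxLoading and dominantVar are my own. This is only a rough heuristic: every component still mixes all columns.

```matlab
% Illustrative data only (not from the post): 5 observations, 3 variables.
X  = [2 0 1; 4 1 3; 6 1 2; 8 3 5; 10 2 4];
Xc = bsxfun(@minus, X, mean(X));          % zero-center each column

[V, E] = eig(cov(Xc));
[latent, order] = sort(diag(E), 'descend');
V = V(:, order);                          % column k of V is the k-th component

% Row i of V holds the coefficient of input column i in every component.
% For each component, find the input column with the largest absolute coefficient.
[maxLoading, dominantVar] = max(abs(V), [], 1);

for k = 1:numel(latent)
    fprintf('PC%d: variance %.4g, largest |coefficient| %.3f on input column %d\n', ...
            k, latent(k), maxLoading(k), dominantVar(k));
end
```

Taking abs(V) sidesteps the arbitrary sign of each eigenvector. To see the full mix for component k, look at V(:,k) directly.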


干净又极端

Reply #3 · 2019-01-29 15:06

"The information (which eigenvalue corresponds to which original (input) column) is lost."

Since each principal component is a linear function of all input variables, each principal component (eigenvector and eigenvalue) corresponds to all of the original input columns. Ignoring possible changes in sign, which are arbitrary in PCA, reordering the input variables will not change the PCA results: the eigenvalues and scores stay the same, and the entries of each eigenvector are simply reordered to follow the columns.
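A quick way to check this is to swap two columns and compare. The sketch below is only an illustration under my own assumptions. It uses a made-up matrix rather than the one in the question, because that matrix has a repeated zero eigenvalue and its eigenvectors are therefore not unique. The names perm, sameEigenvalues and sameVectors are mine.

```matlab
% Illustrative data only (not from the post): 5 observations, 3 variables.
m    = [2 0 1; 4 1 3; 6 1 2; 8 3 5; 10 2 4];
perm = [2 1 3];                           % swap the first two columns
m2   = m(:, perm);

% PCA via the covariance matrix, sorted by decreasing variance, for both versions
[V1, E1] = eig(cov(m));
[lat1, o1] = sort(diag(E1), 'descend');
V1 = V1(:, o1);

[V2, E2] = eig(cov(m2));
[lat2, o2] = sort(diag(E2), 'descend');
V2 = V2(:, o2);

% The eigenvalues are unchanged by the column swap
sameEigenvalues = all(abs(lat1 - lat2) < 1e-10);

% The eigenvector rows follow the column permutation (compared up to sign)
sameVectors = all(all(abs(abs(V1(perm, :)) - abs(V2)) < 1e-10));

disp([sameEigenvalues sameVectors]);
```

So the link to the input columns is not lost: it lives in the rows of the eigenvector matrix, not in the order of the eigenvalues.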

"Is there a way to tell matlab to not to sort the eigenvalues?"

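I doubt it: PCA (and eigenanalysis in general) conventionally sorts the results by variance, though I'd note that princomp() sorts from greatest to least variance, while eig() typically returns the eigenvalues of a symmetric matrix (such as a covariance matrix) in the opposite, ascending order. The order returned by eig() is not guaranteed in general, so sort explicitly if it matters.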

For more explanation of PCA using MATLAB illustrations, with or without princomp(), see:

Principal Components Analysis
