I am using PCA on 100 images. My training data is a 442368x100 double
matrix: 442368 is the number of features and 100 is the number of images. Here is my code for finding the eigenvectors.
[ rows, cols] = size(training);
maxVec=rows;
maxVec=min(maxVec,rows);
train_mean=mean(training,2);
A=training-train_mean*ones(1,cols);
A=A'*A;
[evec,eval]=eig(A);
[eval ind] = sort(-1*diag(eval));
evec= evec(:, ind(1:100));
Now evec
is a 100x100 double eigenvector matrix, and I now have 100 sorted eigenvectors.
Questions:
Now, if I want to transform my testing data using the eigenvectors calculated above, how do I use them? My testing data is a 442368x50 double
matrix, but my eigenvector matrix is 100x100 double
. The inner matrix dimensions don't agree. How can I compute the product of my testing data and the eigenvector matrix?
What you are doing is essentially dimensionality reduction. You currently have the top 100 eigenvectors that determine the basis vectors retaining the largest variance in your data. What you want to do now is project your test data onto these same basis vectors. BTW, you do have an error in your covariance matrix calculation. The covariance is computed on a per-feature basis, but you are computing it on a per-image basis, so that's not correct. You have to swap the order of the transpose in your calculation. You must also divide by the total number of examples minus 1 to complete the calculation and produce an unbiased estimator:
A = (1/(cols-1))*(A*A.');
Transposing A
first and then multiplying assumes that each column is a feature, but that is not the case for you. Recall from dimensionality reduction that we currently have a matrix of eigenvectors where each column is one eigenvector. To finally perform the reduction, you simply multiply the mean-subtracted data matrix by the eigenvector matrix. It is important to note that the eigenvectors in this matrix are ordered so that the basis vector capturing the largest variance in your data appears first; that is why the sorting is performed on the eigenvalues, as the eigenvector with the largest eigenvalue embodies this property. However, this operation assumes that each column is a feature, while your data matrix has each row as a feature. If you wanted to project your original training data this way, you would need to transpose the mean-subtracted data before the multiplication, which would put each example in a row. Since in your code each column is an example, you can transpose the eigenvector matrix instead:
% Assuming you did (1/(cols-1))*(A*A.') to compute the eigenvectors
Atrain = training - train_mean*ones(1, cols);
Areconstruct = evec.' * Atrain;
Areconstruct
will contain the projected data, where each column is the corresponding projected example. I also needed to store the mean-subtracted feature matrix, as your code overwrites it with the covariance matrix. If you want to perform this projection on your test data, you must subtract the mean computed from your training data, then apply the multiplication above. Assuming your data is stored in test_data
, simply do the following:
cols_test = size(test_data, 2);
B = test_data - train_mean*ones(1, cols_test);
Breconstruct = evec.' * B;
Breconstruct
contains the test data projected onto the basis vectors. It is now a 100 x 50
matrix where each column is a projected example from your test data.
A word of warning
This code will probably run very slowly or, in the worst case, not run at all, because your covariance matrix is quite large: with 442368 features, A*A.' is a 442368 x 442368 matrix. It is highly advisable to reduce the total number of features as much as possible before attempting dimensionality reduction. As you have stated in your comments, each example is simply an image unrolled into a long vector, so try resizing the images down to something manageable. In addition, it is customary to low-pass filter (with a Gaussian blur, for example) each image before resizing it down, as this prevents aliasing.
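As a rough sketch of that preprocessing step (this assumes the Image Processing Toolbox; the file names, target size, and blur sigma are made-up placeholders, not from your setup):

```matlab
% Sketch: blur each image first to prevent aliasing, then resize down,
% then unroll into a column of the training matrix.
n_images = 100;
target_size = [64 64];                     % assumed manageable size
training = zeros(prod(target_size), n_images);
for k = 1:n_images
    I = im2double(imread(sprintf('face%03d.png', k))); % hypothetical files
    I = imgaussfilt(I, 2);                 % low-pass filter (Gaussian blur)
    I = imresize(I, target_size);          % then downsample
    training(:, k) = I(:);                 % unroll into a long vector
end
```

This shrinks each feature vector from 442368 entries to 4096, which makes the covariance matrix tractable.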
Also, check out the recommendation I have for using Singular Value Decomposition later on in this post. It should be faster than using the eigenvectors of the covariance matrix.
Can you make your code more efficient?
I would improve on this code by using bsxfun
, and you can also use sort
with the descend
flag so you don't have to multiply your values by -1 prior to sorting to get the indices in descending order. bsxfun
allows you to efficiently mean-subtract your features without explicitly replicating the mean of each feature across all of your examples (i.e. without train_mean*ones(1, cols)
). Note that in newer MATLAB versions (R2016b and later), implicit expansion lets you write training - train_mean directly.
Specifically:
[rows, cols] = size(training);
train_mean = mean(training, 2);
A = bsxfun(@minus, training, train_mean); % Change
%A = training - train_mean*ones(1, cols);
Acov = (1/(cols-1))*(A*A.'); % Change - correct formula
[evec, eval] = eig(Acov);
%[eval ind] = sort(-1*diag(eval));
[eval, ind] = sort(diag(eval), 'descend'); % Change
evec = evec(:, ind(1:100));
Finally for your test data:
B = bsxfun(@minus, test_data, train_mean);
Breconstruct = evec.' * B;
A word of advice - Use SVD
Using the eigenvectors for dimensionality reduction is known to be unstable - specifically when it comes to computing eigenvectors for high dimensional data such as what you have. It is advised that you use the Singular Value Decomposition (SVD) framework to do so instead. You can view this Cross Validated post on the relationship between the eigenvectors of the covariance matrix and using SVD for performing PCA:
https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca
As such, compute the SVD of the covariance matrix, and the columns of V
are the eigenvectors you need. An added benefit of SVD is that the singular vectors are already ordered by decreasing variance, so the first column of V
is the basis vector pointing in the direction of largest variance. As such, you don't need to do any sorting as you did with the eigenvectors.
Therefore, you would use this with the SVD:
Acov = (1/(cols-1))*(A*A.');
[U,S,V] = svd(Acov);
Areconstruct = V(:, 1:100).' * A;
For your testing data:
B = bsxfun(@minus, test_data, train_mean);
Breconstruct = V(:, 1:100).' * B;
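One further note, going beyond the code above: if forming the full covariance matrix is infeasible at your feature count, you can take an economy-size SVD of the mean-subtracted data matrix itself. For A of size rows x cols with rows >> cols, the columns of U from svd(A, 'econ') are the eigenvectors of A*A.', so they play the same role as the eigenvectors above without ever building the covariance matrix. A minimal sketch, assuming A and B are the mean-subtracted training and test matrices from earlier:

```matlab
% Sketch: economy-size SVD on the mean-subtracted data directly.
% U is rows x cols; its columns are the principal directions.
[U, S, ~] = svd(A, 'econ');
Areconstruct = U(:, 1:100).' * A;   % project training data (100 x cols)
Breconstruct = U(:, 1:100).' * B;   % project test data (100 x 50)
```

With 100 training images, cols is 100, so this SVD is cheap even for very long feature vectors.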
Further Reading
You can take a look at my post on dimensionality reduction using eigenvectors and eigenvalues from the covariance matrix from my answer here: What does selecting the largest eigenvalues and eigenvectors in the covariance matrix mean in data analysis?
It also gives you a brief overview of why this operation is done to perform PCA or dimensionality reduction. However, I would highly advise you use SVD to do what you need. It is faster and more stable than using the eigenvectors of the covariance matrix.