I am using Principal Component Analysis (PCA) on features extracted from different layers of a CNN. I have downloaded the dimensionality reduction toolbox from here.
I have a total of 11232 training images, and each image has a 6532-dimensional feature vector, so the feature matrix is 11232x6532.
If I keep the top 90% of the dimensions, I can easily do that, and the training accuracy of an SVM on the reduced data is 81.73%, which is fair.
However, when I try the testing data, which has 2408 images each with 6532 features, the feature matrix is 2408x6532. In that case the output for the top 90% of features is not correct: it comes out as 2408x2408, and the testing accuracy drops to 25%.
Without dimensionality reduction, the training accuracy is 82.17% and the testing accuracy is 79%.
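The 2408x2408 output is the giveaway: PCA was re-fitted on the test set alone, and with only 2408 samples it cannot produce more than 2408 components. The correct procedure is to fit PCA once on the training data and then project the test data with the same mean and eigenvectors. A numpy sketch of that idea on toy data (the sizes here are made up; the real matrices are 11232x6532 and 2408x6532):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 50 "training" and 10 "testing" samples, 20 features.
X_train = rng.normal(size=(50, 20))
X_test = rng.normal(size=(10, 20))

# Fit PCA on the training data only.
mean = X_train.mean(axis=0)
C = np.cov(X_train - mean, rowvar=False)      # 20 x 20 covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]             # sort components by variance
M = eigvecs[:, order[:18]]                    # keep "top 90%" of the dimensions

# Out-of-sample projection: same mean, same eigenvectors.
mapped_test = (X_test - mean) @ M
print(mapped_test.shape)                      # (10, 18)
```

The test projection keeps its own number of rows but inherits the component count from the training fit, which is what should happen with the real data as well.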
Update:
Here X is the data and no_dims is the required number of output dimensions. The PCA function returns the variable mappedX and the structure mapping.
% Make sure data is zero mean
mapping.mean = mean(X, 1);
X = bsxfun(@minus, X, mapping.mean);

% Compute covariance matrix
if size(X, 2) < size(X, 1)
    C = cov(X);
else
    C = (1 / size(X, 1)) * (X * X'); % if D >= N, the smaller N x N Gram matrix is cheaper to eigendecompose
end

% Perform eigendecomposition of C
C(isnan(C)) = 0;
C(isinf(C)) = 0;
[M, lambda] = eig(C);

% Sort eigenvectors in descending order
[lambda, ind] = sort(diag(lambda), 'descend');
if no_dims < 1
    no_dims = find(cumsum(lambda ./ sum(lambda)) >= no_dims, 1, 'first');
    disp(['Embedding into ' num2str(no_dims) ' dimensions.']);
end
if no_dims > size(M, 2)
    no_dims = size(M, 2);
    warning(['Target dimensionality reduced to ' num2str(no_dims) '.']);
end
M = M(:,ind(1:no_dims));
lambda = lambda(1:no_dims);

% Apply mapping on the data
if ~(size(X, 2) < size(X, 1))
    M = bsxfun(@times, X' * M, (1 ./ sqrt(size(X, 1) .* lambda))'); % normalize to recover eigenvectors of the covariance matrix
end
mappedX = X * M;

% Store information for out-of-sample extension
mapping.M = M;
mapping.lambda = lambda;
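The else branch above uses a standard identity: if u is an eigenvector of the N x N Gram matrix (1/N) X X' with eigenvalue lambda, then X' u / sqrt(N * lambda) is a unit-norm eigenvector of the D x D covariance (1/N) X' X with the same eigenvalue. That is exactly the conversion the bsxfun line performs. A small numpy check of the identity on toy data (not the author's features):

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 8, 30                                  # fewer samples than features, as in the test set
X = rng.normal(size=(N, D))
X = X - X.mean(axis=0)                        # zero-mean, as in the toolbox code

# Eigendecompose the small N x N matrix (1/N) X X' instead of the D x D covariance.
G = (X @ X.T) / N
lam, U = np.linalg.eigh(G)
keep = lam > 1e-10                            # at most N-1 non-trivial components after centering
lam, U = lam[keep], U[:, keep]

# Recover covariance-matrix eigenvectors: M = X' U / sqrt(N * lambda).
M = (X.T @ U) / np.sqrt(N * lam)

# Check: unit-norm eigenvectors of the covariance C = (1/N) X' X.
C = (X.T @ X) / N
print(np.allclose(C @ M, M * lam))            # True
print(np.allclose(np.linalg.norm(M, axis=0), 1.0))  # True
```

This is why the 2408x2408 matrix appears when the toolbox is fed the test set alone: with N = 2408 < D = 6532 it eigendecomposes the Gram matrix, and there are at most 2408 components to return.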
Based on your suggestion, I have computed the mapping on the training data:
numberOfDimensions = round(0.9*size(Feature,2));
[mapped_data, mapping] = compute_mapping(Feature, 'PCA', numberOfDimensions);
Then I use the same mapping for the testing data:
mappedX_test = Feature_test * mapping.M;
Still, the accuracy is only 32%.
Solved by subtracting the training mean before projecting:
Y = bsxfun(@minus, Feature_test, mapping.mean);
mappedX_test = Y * mapping.M;
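The subtraction matters because the eigenvectors in mapping.M were computed on centered training data, so test points must be shifted by the same training mean before projection; multiplying the raw features by mapping.M adds a constant offset to every projected point. A numpy illustration of the difference (toy data, hypothetical sizes):

```python
import numpy as np

rng = np.random.default_rng(2)
X_train = rng.normal(size=(40, 6)) + 5.0      # deliberately non-zero mean
mean = X_train.mean(axis=0)
C = np.cov(X_train - mean, rowvar=False)
lam, V = np.linalg.eigh(C)
M = V[:, np.argsort(lam)[::-1][:4]]           # top-4 principal directions

x = X_train[0]                                # treat a training row as a "test" point
good = (x - mean) @ M                         # center with the *training* mean
bad = x @ M                                   # no centering, as in mappedX_test = Feature_test * mapping.M

train_mapped = (X_train - mean) @ M
print(np.allclose(good, train_mapped[0]))     # True: consistent with the training projection
print(np.allclose(bad, train_mapped[0]))      # False: shifted by mean @ M
```

With centering restored, the test projection lives in the same coordinate frame as the training projection, which is why the SVM accuracy recovers.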