I want to implement Bag of Visual Words in MATLAB. First, I read the images from the dataset directory, then I detect SURF features and extract their descriptors using the two functions detectSURFFeatures and extractFeatures.
I store each image's features in a cell array and then want to cluster them with the k-means algorithm, but I can't get this data into the form that the kmeans function expects as input. How can I pass SURF features to the k-means clustering algorithm in MATLAB?
Here is my sample code, which reads the images from files and extracts their SURF features.
clc;
clear;
close all;
folder = 'CarData/TrainImages/cars';
filePattern = fullfile(folder, '*.pgm');
f=dir(filePattern);
files={f.name};
for k=1:numel(files)
    fullFileName = fullfile(folder, files{k});
    image=imread(fullFileName);
    temp = detectSURFFeatures(image);
    [im_features, temp] = extractFeatures(image, temp);
    features{k}= im_features;
end
[centers, assignments] = kmeans(double(features), 100);
kmeans expects an N x P matrix for the input data, where N is the total number of examples and P is the number of dimensions of each example (here, the length of one feature descriptor). The problem is that you are passing a cell array of per-image feature matrices to kmeans, and double cannot convert a cell array into a numeric matrix. What you have to do instead is concatenate the features from all of the images into a single matrix.
The easiest way to do that is to add the following line before your kmeans call:
features = vertcat(features{:});
The function vertcat vertically stacks matrices, given a list of matrices that all share the same number of columns. Writing features{:} expands the cell array into a comma-separated list, so the call is equivalent to:
features = vertcat(features{1}, features{2}, ...);
The net effect is that the SURF descriptors from every image are stacked vertically into a single 2D matrix. Because you are using the default SURF settings, each descriptor has length 64, so the matrix should have 64 columns. The number of rows should be the total number of features detected over all images.
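As a quick sanity check of that shape, here is a minimal sketch that uses synthetic matrices in place of real SURF descriptors. The row counts, the single-precision type, and the variable name allFeatures are assumptions made purely for illustration; only base MATLAB functions are used.

```matlab
% Synthetic stand-ins for per-image descriptor matrices (sizes are made up).
features = cell(1, 3);
features{1} = rand(5, 64, 'single');   % image 1: 5 descriptors
features{2} = rand(8, 64, 'single');   % image 2: 8 descriptors
features{3} = zeros(0, 64, 'single');  % image 3: no keypoints detected

% Expand the cell array into a comma-separated list and stack the rows.
allFeatures = vertcat(features{:});

% Rows = 5 + 8 + 0 descriptors, columns = 64 dimensions.
disp(size(allFeatures));
```

An image that yields no keypoints contributes zero rows, so it does not break the concatenation.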
Therefore, to be absolutely clear:
clc;
clear;
close all;
folder = 'CarData/TrainImages/cars';
filePattern = fullfile(folder, '*.pgm');
f=dir(filePattern);
files={f.name};
for k=1:numel(files)
    fullFileName = fullfile(folder, files{k});
    image=imread(fullFileName);
    temp = detectSURFFeatures(image);
    [im_features, temp] = extractFeatures(image, temp);
    features{k}= im_features;
end
% New code: stack the per-image descriptor matrices into one N x 64 matrix
features = vertcat(features{:});
% Note: kmeans returns the cluster indices first and the centroids second
[assignments, centers] = kmeans(double(features), 100);
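If you want to confirm what kmeans hands back before running it on the full dataset, the following is a small sketch on random data rather than real SURF descriptors. It assumes the kmeans function from the Statistics and Machine Learning Toolbox; the names X, numWords, idx and C, and the sizes used, are hypothetical.

```matlab
% Synthetic stand-in for the stacked descriptor matrix (not real SURF data).
rng(0);                 % fixed seed so repeated runs match
X = rand(500, 64);      % 500 descriptors, 64 dimensions each
numWords = 10;          % hypothetical vocabulary size

% First output: cluster index for each row of X.
% Second output: one centroid (visual word) per row.
[idx, C] = kmeans(X, numWords);

disp(size(idx));        % 500 x 1, one label per descriptor
disp(size(C));          % 10 x 64, one centroid per visual word
```

The rows of the centroid matrix are the visual words, and the index vector tells you which word each descriptor was assigned to.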