I'm using HoG features for object detection via classification.
I'm confused about how to deal with HoG feature vectors of different lengths.
I've trained my classifier using training images that all have the same size.
Now I'm extracting regions from my image on which to run the classifier, say with a sliding-window approach. Some of the windows I extract are a lot bigger than the images the classifier was trained on. (It was trained on the smallest size at which the object might be expected to appear in test images.)
The problem is, when the windows I need to classify are bigger than the training image sizes, then the HoG feature vector is also much bigger than the trained model's feature vector.
So how can I use the model's feature vector to classify the extracted window?
For example, let's take the dimensions of one extracted window, which is 360x240, and call it extractedwindow. Then let's take one of my training images, which is only 20x30, and call it trainingsample.
If I take the HoG feature vectors, like this:
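from skimage.feature import hog

# Newer scikit-image releases spell these keywords visualize and transform_sqrt
# (older releases used visualise and normalise).
fd1, hog_image1 = hog(extractedwindow, orientations=8, pixels_per_cell=(16, 16), cells_per_block=(1, 1), visualize=True, transform_sqrt=True)
fd2, hog_image2 = hog(trainingsample, orientations=8, pixels_per_cell=(16, 16), cells_per_block=(1, 1), visualize=True, transform_sqrt=True)
print(len(fd1))
print(len(fd2))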
Then this is the difference in length between the feature vectors:
2640
616
So how is this dealt with? Are extracted windows supposed to be scaled down to the size of the samples the classifier was trained on? Or should the parameters for HoG features be changed/normalized according to each extracted window? Or is there another way to do this?
I'm personally working in Python, using scikit-image, but I guess the problem is independent of the platform I'm using.
As you say, HOG uses a parameter that sets the cell size in pixels. So if the image size changes, the number of cells changes, and so does the length of the descriptor.
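To make the dependence on image size concrete, here is a minimal sketch that estimates the descriptor length from the image shape. The helper name hog_length is hypothetical, and the formula assumes the usual dense-grid layout (only whole cells are counted, and blocks step one cell at a time), which is my understanding of how scikit-image lays out the descriptor.

```python
def hog_length(image_shape, orientations=8,
               pixels_per_cell=(16, 16), cells_per_block=(1, 1)):
    """Estimate the HOG descriptor length for a 2-D image (hypothetical helper)."""
    rows, cols = image_shape
    cell_rows, cell_cols = pixels_per_cell
    block_rows, block_cols = cells_per_block

    # Only whole cells are counted; leftover pixels at the border are ignored.
    n_cells_row = rows // cell_rows
    n_cells_col = cols // cell_cols

    # Assumption: blocks slide over the cell grid one cell at a time.
    n_blocks_row = max(n_cells_row - block_rows + 1, 0)
    n_blocks_col = max(n_cells_col - block_cols + 1, 0)

    # Each block contributes one histogram per cell it contains.
    return n_blocks_row * n_blocks_col * block_rows * block_cols * orientations


print(hog_length((360, 240)))  # 22 * 15 * 8 = 2640, the length printed in the question
print(hog_length((128, 64)))   # 8 * 4 * 8 = 256
```

With the cell size fixed in pixels, the only way to get the same length for every window is to give every window the same pixel dimensions.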
The main approach with HOG is to use windows of the same size in pixels (the same size during training and during testing). So extractedwindow should be the same size as trainingsample.
In that reference, one user says:
So you should use the same window size...
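One way to get there, sketched below, is to resize every extracted window to the training size before computing HOG. This is only an illustration under assumptions: TRAIN_SHAPE and window_descriptor are hypothetical names, the (128, 64) training size is made up for the example, the random arrays stand in for real grayscale windows, and the HOG parameters are copied from the question.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize

# Hypothetical (rows, cols) of the training images; use your own training size.
TRAIN_SHAPE = (128, 64)


def window_descriptor(window):
    """Resize a grayscale window to TRAIN_SHAPE, then compute its HOG descriptor."""
    # anti_aliasing smooths the window before it is shrunk.
    resized = resize(window, TRAIN_SHAPE, anti_aliasing=True)
    return hog(resized, orientations=8,
               pixels_per_cell=(16, 16), cells_per_block=(1, 1))


# Stand-in data: random arrays in place of real grayscale windows.
rng = np.random.default_rng(0)
big_window = rng.random((360, 240))
small_window = rng.random((128, 64))

fd_big = window_descriptor(big_window)
fd_small = window_descriptor(small_window)

# Both descriptors now have the same length, whatever the window size was.
print(fd_big.shape, fd_small.shape)
```

Each descriptor can then be passed to the trained classifier. Note that resizing distorts the object if a window's aspect ratio differs from that of the training images, so it helps to extract windows with the training aspect ratio.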