I've been following the examples here on setting up Python for OCR by training OpenCV with kNN classification. I followed the first example and generated a knn_data.npz file that stores the training data and the training labels for later use. What I'm trying to do now is load that training data and apply it to an OpenCV image that contains a single character:
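import cv2
import numpy as np

# Load training data
trainingData = np.load('knn_data.npz')
train = trainingData['train']
trainLabels = trainingData['train_labels']

# Train the kNN classifier (OpenCV 2.4 API)
knn = cv2.KNearest()
knn.train(train, trainLabels)

# Load the single-character image and convert it to grayscale
letter = cv2.imread('letter.png')
letter = cv2.cvtColor(letter, cv2.COLOR_BGR2GRAY)
print(letter.shape)

# Flatten the 10x10 image into one float32 row vector
letter = letter.reshape((1, 100))
letter = np.float32(letter)
print(letter.shape)

# Classify using the 5 nearest neighbours
ret, result, neighbors, dist = knn.find_nearest(letter, k=5)
print(result)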
The 'letter.png' image is 10x10, so it's perfectly safe to reshape, and NumPy successfully reshapes it into a row vector of shape (1, 100). However, when I pass this to the knn.find_nearest(...) function, I get an error saying the input must be a floating-point matrix:
OpenCV Error: Bad argument (Input samples must be floating-point matrix (<num_samples>x<var_count>)) in find_nearest, file /build/buildd/opencv-2.4.8+dfsg1/modules/ml/src/knearest.cpp, line 370
Traceback (most recent call last):
File "sudoku.py", line 103, in <module>
ret, result, neighbors, dist = knn.find_nearest(letter, k=5)
cv2.error: /build/buildd/opencv-2.4.8+dfsg1/modules/ml/src/knearest.cpp:370: error: (-5) Input samples must be floating-point matrix (<num_samples>x<var_count>) in function find_nearest
However, I reshaped my image so that it occupies a single row and converted it to float32, so I'm not sure why this error is coming up. Any suggestions?
I just realized why this is happening. For the kNN classification to work, the test data (or single letter in this case) needs to have exactly the same number of features as the training data. In this case, my training data used 20x20 images, so each row vector had a length of 400, but my letter is only 10x10, which gives a row vector of length 100.
I fixed this by scaling up my letter to 20x20 and flattening it into a row vector of size 400 (20^2).
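As a rough sketch of that fix: the helper below scales an image to 20x20 and flattens it into a float32 row of 400 values. The name to_feature_row is made up for illustration, and the nearest-neighbour index trick stands in for the resize step only so the snippet runs without OpenCV or an image file. With OpenCV you would more likely call cv2.resize(letter, (20, 20)) before reshaping.

```python
import numpy as np

def to_feature_row(img, side=20):
    """Scale a 2-D grayscale image to side x side (nearest neighbour)
    and flatten it into a float32 row vector of shape (1, side * side).
    Hypothetical helper; not part of OpenCV."""
    # Map each output row/column index back to a source index.
    rows = np.arange(side) * img.shape[0] // side
    cols = np.arange(side) * img.shape[1] // side
    resized = img[rows[:, None], cols]
    return resized.reshape(1, side * side).astype(np.float32)

# Stand-in for the 10x10 grayscale letter image.
letter = np.arange(100, dtype=np.uint8).reshape(10, 10)
row = to_feature_row(letter, side=20)
print(row.shape)
```

Before classifying, it is worth checking that row.shape[1] equals train.shape[1], since a mismatch there is what triggered the error above.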
This doesn't necessarily have to be a single row vector, either. The test data can be formatted as a matrix exactly like the training data, where each row contains one sample (in this case, a letter). find_nearest will then return a matrix in which each row corresponds to one row of the test data.
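To illustrate the one-sample-per-row layout, here is a small sketch using a brute-force kNN written in plain NumPy. The function knn_predict and the synthetic data are assumptions made for this example; it is not OpenCV's implementation, just a stand-in that returns one result row per test row. With cv2.KNearest, the analogous call would be knn.find_nearest(samples, k=5).

```python
import numpy as np

def knn_predict(train, train_labels, samples, k=5):
    """Brute-force kNN: returns an (n_samples, 1) float32 matrix with
    one predicted label per row of `samples`. Hypothetical stand-in;
    assumes labels are non-negative integers stored as floats."""
    if samples.shape[1] != train.shape[1]:
        raise ValueError("expected %d features per sample, got %d"
                         % (train.shape[1], samples.shape[1]))
    # Squared Euclidean distance from every test row to every training row.
    diff = samples[:, None, :] - train[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    # Indices of the k closest training rows for each test row.
    nearest = np.argsort(dist, axis=1)[:, :k]
    neighbor_labels = train_labels.ravel().astype(int)[nearest]
    # Majority vote among the k neighbours.
    votes = [np.bincount(r).argmax() for r in neighbor_labels]
    return np.array(votes, dtype=np.float32).reshape(-1, 1)

# Synthetic training set: two well-separated classes, 400 features each.
rng = np.random.RandomState(0)
train = np.vstack([rng.normal(0, 1, (20, 400)),
                   rng.normal(10, 1, (20, 400))]).astype(np.float32)
train_labels = np.repeat([0, 1], 20).astype(np.float32).reshape(-1, 1)

# Three test samples stacked as rows, same layout as the training data.
samples = np.vstack([np.zeros((1, 400)),
                     np.full((1, 400), 10.0),
                     np.ones((1, 400))]).astype(np.float32)

result = knn_predict(train, train_labels, samples, k=5)
print(result.shape)
```

Passing a matrix with the wrong number of columns, such as a (1, 100) row, raises a ValueError here, which is the same feature-count mismatch behind the original error.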