Pass in OpenCV image to KNearest's find_nearest

Posted 2019-05-23 09:36

Question:

I've been following the examples here on setting up Python for OCR by training OpenCV with kNN classification. I followed the first example and generated a knn_data.npz file that stores the training data and the training labels for later use. What I'm trying to do now is load that training data and apply it to an OpenCV image that contains a single character:

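import cv2
import numpy as np

# Load the training data and labels saved earlier
trainingData = np.load('knn_data.npz')
train = trainingData['train']
trainLabels = trainingData['train_labels']

# Train the kNN classifier (OpenCV 2.4 API)
knn = cv2.KNearest()
knn.train(train, trainLabels)

# Read the single-character image and convert it to grayscale
letter = cv2.imread('letter.png')
letter = cv2.cvtColor(letter, cv2.COLOR_BGR2GRAY)
print letter.shape
# Flatten the 10x10 image into one row of 100 float32 values
letter = letter.reshape((1,100))
letter = np.float32(letter)
print letter.shape

ret, result, neighbors, dist = knn.find_nearest(letter, k=5)
print result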

The 'letter.png' image is 10x10, so it's perfectly safe to reshape, and numpy successfully reshapes the image into a single-row array of shape (1, 100). However, when I pass this into knn.find_nearest(...), I get an error that says to use a floating-point matrix:

OpenCV Error: Bad argument (Input samples must be floating-point matrix (<num_samples>x<var_count>)) in find_nearest, file /build/buildd/opencv-2.4.8+dfsg1/modules/ml/src/knearest.cpp, line 370
Traceback (most recent call last):
  File "sudoku.py", line 103, in <module>
    ret, result, neighbors, dist = knn.find_nearest(letter, k=5)
cv2.error: /build/buildd/opencv-2.4.8+dfsg1/modules/ml/src/knearest.cpp:370: error: (-5) Input samples must be floating-point matrix (<num_samples>x<var_count>) in function find_nearest

However, I reshaped my image so that it occupies a single row and converted it to float32, so I'm not sure why this error is coming up. Any suggestions?

Answer 1:

I just realized why this is happening. For kNN classification to work, the test data (a single letter in this case) must have exactly the same number of features as the training data. My training data used 20x20 images, so each row vector had a length of 400, but my letter is only 10x10, which gives a row vector of length 100.
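Because the OpenCV error message only mentions a "floating-point matrix", a mismatch like this is easy to miss. Below is a minimal sketch of a check that could be run before calling find_nearest. The helper name check_feature_count and the zero-filled arrays are hypothetical stand-ins for the real training data and letter, not part of OpenCV:

```python
import numpy as np

def check_feature_count(train, samples):
    """Raise ValueError if samples and train disagree on the number of columns.

    Hypothetical helper for illustration; it is not an OpenCV function.
    """
    train = np.asarray(train)
    samples = np.asarray(samples)
    if samples.ndim != 2:
        raise ValueError(
            "samples must be 2-D (num_samples x num_features), got shape %s"
            % (samples.shape,))
    if samples.shape[1] != train.shape[1]:
        raise ValueError(
            "feature count mismatch: training rows have %d features, "
            "samples have %d" % (train.shape[1], samples.shape[1]))

# Synthetic stand-ins with the shapes described in this thread
train = np.zeros((50, 400), dtype=np.float32)  # 50 flattened 20x20 images
letter = np.zeros((1, 100), dtype=np.float32)  # one flattened 10x10 image

try:
    check_feature_count(train, letter)
    mismatch_message = None
except ValueError as exc:
    mismatch_message = str(exc)

print(mismatch_message)
```

With the shapes from the question, the check reports 400 training features against 100 sample features, which points directly at the real cause.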

I fixed this by scaling up my letter to 20x20 and flattening it into a row vector of size 400 (20^2).
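A rough sketch of that step follows, using NumPy only so it runs without an image file. The helper name to_feature_row is hypothetical, and the pixel-repetition upscaling via np.kron is an assumption chosen to keep the example self-contained. With OpenCV, cv2.resize(letter, (20, 20)) would be the usual call, and its interpolation would give different pixel values than this sketch:

```python
import numpy as np

def to_feature_row(image, target_side=20):
    """Upscale a square grayscale image and flatten it to one float32 row.

    Hypothetical helper: it repeats each pixel (nearest-neighbour style),
    so the target side must be a whole multiple of the image side.
    """
    image = np.asarray(image)
    height, width = image.shape
    if height != width or target_side % height != 0:
        raise ValueError("expected a square image whose side divides %d"
                         % target_side)
    factor = target_side // height
    # np.kron turns every pixel into a factor x factor block of that pixel
    enlarged = np.kron(image, np.ones((factor, factor), dtype=image.dtype))
    # One row, target_side**2 columns, float32 as find_nearest expects
    return enlarged.reshape(1, target_side * target_side).astype(np.float32)

# Synthetic 10x10 "letter" standing in for the grayscale letter.png
letter = np.arange(100, dtype=np.uint8).reshape(10, 10)
row = to_feature_row(letter)
print(row.shape)
print(row.dtype)
```

The resulting row has 400 columns, matching the 20x20 training samples, so it can be passed to find_nearest in place of the (1, 100) array from the question.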

The test data doesn't have to be a single row vector either. It can be a matrix laid out exactly like the training data, where each row contains one sample (in this case, one letter). find_nearest then returns a matrix in which each row corresponds to one test sample.
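To illustrate the one-row-per-sample layout, here is a small brute-force kNN written in plain NumPy. It is a stand-in for illustration, not OpenCV's implementation. The function name knn_classify and the synthetic two-class data are assumptions, and labels are assumed to be non-negative integers:

```python
import numpy as np

def knn_classify(train, train_labels, samples, k=5):
    """Brute-force kNN vote; returns one result row per sample row.

    Illustrative stand-in for find_nearest, assuming non-negative
    integer labels.
    """
    train = np.asarray(train, dtype=np.float32)
    samples = np.asarray(samples, dtype=np.float32)
    labels = np.asarray(train_labels).ravel().astype(np.int64)
    # Squared Euclidean distances, shape (num_samples, num_train)
    diff = samples[:, np.newaxis, :] - train[np.newaxis, :, :]
    dist = (diff ** 2).sum(axis=2)
    # Indices of the k closest training rows for every sample
    nearest = np.argsort(dist, axis=1)[:, :k]
    neighbor_labels = labels[nearest]
    # Majority vote among the k neighbours of each sample
    results = np.array([np.bincount(r).argmax() for r in neighbor_labels])
    return results.reshape(-1, 1), neighbor_labels

rng = np.random.RandomState(0)
# Two synthetic classes of 400-feature rows: dark (label 0), bright (label 1)
train = np.vstack([rng.uniform(0, 10, (10, 400)),
                   rng.uniform(245, 255, (10, 400))]).astype(np.float32)
train_labels = np.array([0] * 10 + [1] * 10)

# Three test samples stacked as rows, laid out like the training matrix
samples = np.vstack([np.zeros(400), np.full(400, 255.0), np.full(400, 5.0)])
results, neighbor_labels = knn_classify(train, train_labels, samples, k=5)
print(results.shape)
print(neighbor_labels.shape)
```

Three stacked samples produce a results array with three rows, one predicted label per sample, plus a row of k neighbour labels for each sample.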