Simple (working) handwritten digit recognition: ho

Posted 2019-07-20 03:53

Question:

I just wrote this very simple handwritten digit recognition program. Here is an 8 KB archive with the following code + ten .PNG image files. It works: the digit to be recognized is matched to the correct digit in the database.

In short, each digit of the database (50x50 pixels = 2500 coefficients) is summarized into a 10-coefficient vector (by keeping the 10 biggest singular values, see Low-rank approximation with SVD).

Then for the digit to be recognized, we minimize the distance with the digits in the database.
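
In symbols, the approach described above can be sketched as follows. The notation ($M_j$ for database image $j$, $X$ for the image to be recognized, $\sigma_i$ for the $i$-th largest singular value) is introduced here for illustration and is not part of the original post.

```latex
M = U \Sigma V^{T}, \qquad
\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_{50}), \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_{50} \ge 0

M_{10} = \sum_{i=1}^{10} \sigma_i \, u_i v_i^{T}
\qquad \text{(rank-10 approximation of } M \text{)}

\hat{\jmath} = \arg\min_{j \in \{0, \dots, 9\}}
\sum_{i=1}^{10} \bigl( \sigma_i(M_j) - \sigma_i(X) \bigr)^{2}
```

Note that the 10-coefficient vector keeps only the $\sigma_i$ and drops $U$ and $V$. Since an image and its transpose or mirror image have identical singular values, such a vector cannot tell them apart, which may contribute to confusions between similar-looking digits.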

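from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

digits = []
for i in range(11):
    # scipy.misc.imread(..., flatten=True) was removed in SciPy 1.2;
    # Pillow's convert('F') is used here as a replacement (grayscale, float32)
    M = np.asarray(Image.open(str(i) + '.png').convert('F'))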
    U, s, V = np.linalg.svd(M, full_matrices=False)
    s[10:] = 0        # keep the 10 biggest singular values only, discard others
    S = np.diag(s)
    M_reduced = np.dot(U, np.dot(S, V))      # reconstitution of image with 10 biggest singular values
    digits.append({'original': M, 'singular': s[:10], 'reduced': M_reduced})

# each 50x50 pixels digit is summarized into a vector of 10 coefficients : the 10 biggest singular values s[:10]    

# 0.png to 9.png = all the digits (for machine training)
# 10.png = the digit to be recognized
toberecognizeddigit = digits[10]    
digits = digits[:10]

# we find the nearest neighbour by minimizing the squared distance between the singular values
# of toberecognizeddigit and those of each digit in the database
recognizeddigit = min(digits, key=lambda d: np.sum((d['singular'] - toberecognizeddigit['singular'])**2))

plt.imshow(toberecognizeddigit['reduced'], interpolation='nearest', cmap=plt.cm.Greys_r)
plt.show()
plt.imshow(recognizeddigit['reduced'], interpolation='nearest', cmap=plt.cm.Greys_r)
plt.show()

Question:

The code works (you can run the code in the ZIP archive), but how can we improve it to have better results? (mostly math techniques I imagine).

For example in my tests, 9 and 3 are sometimes confused with each other.

Answer 1:

Digit recognition can be quite a difficult area, especially when the digits are written in very different or unclear ways. Many approaches have been tried, and entire competitions are dedicated to the subject. For an example, see Kaggle's digit recognizer competition, which is based on the well-known MNIST data set. In its forums you will find a lot of ideas and approaches to this problem, but I will give some quick suggestions.

A lot of people approach this problem as a classification problem. Possible algorithms to solve such problems include, for example, kNN, neural networks, or gradient boosting.
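
As an illustration of the classification view, here is a minimal kNN sketch in plain NumPy. The function name `knn_predict` and the toy 2-D feature vectors are hypothetical; in practice each row would be a feature vector computed from a digit image, and kNN with k > 1 only makes sense with several training examples per digit.

```python
import numpy as np

def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k training vectors closest to x."""
    # Squared Euclidean distance from x to every training vector
    dists = np.sum((train_X - x) ** 2, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # Majority vote among the neighbours' labels
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy data (hypothetical): two clusters of 2-D feature vectors, labelled 3 and 9
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
train_y = np.array([3, 3, 3, 9, 9, 9])

prediction = knn_predict(train_X, train_y, np.array([0.95, 1.0]), k=3)
```

With k=1 and one example per digit, this reduces to the nearest-neighbour search used in the question.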

However, the choice of algorithm alone is generally not enough to get optimal classification rates. Another important way to improve your scores is feature extraction. The idea is to calculate features that make it possible to distinguish between different digits. Some example features for this dataset might include the number of colored pixels, or maybe the width and the height of the digits.
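
The features mentioned above could be computed roughly as follows. This is a sketch under assumptions: the function name `simple_features`, the threshold of 0.5, and the convention that ink pixels are brighter than the background are all choices made for the example and would need adapting to the actual images.

```python
import numpy as np

def simple_features(img, threshold=0.5):
    """Return (ink pixel count, bounding-box width, bounding-box height)."""
    # Assumption: ink pixels are brighter than the threshold
    ink = img > threshold
    if not ink.any():
        return 0, 0, 0
    # Rows and columns that contain at least one ink pixel
    rows = np.flatnonzero(ink.any(axis=1))
    cols = np.flatnonzero(ink.any(axis=0))
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    return int(ink.sum()), int(width), int(height)

# Hypothetical 6x6 test image: a vertical bar, 1 pixel wide and 4 pixels tall
img = np.zeros((6, 6))
img[1:5, 2] = 1.0
features = simple_features(img)
```

For dark ink on a light background, the comparison would be flipped to `img < threshold`.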

Although the other algorithms might not be what you are looking for, it is possible that adding more features can improve the performance of the algorithm you are currently using as well.
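
One possible way to add a feature to the nearest-neighbour method from the question is to append it to the singular-value vector and standardize each column before measuring distances, so that no single feature dominates. The helper names `feature_vector` and `nearest_index`, the ink threshold, and the toy images below are hypothetical and serve only to illustrate the idea.

```python
import numpy as np

def feature_vector(img, n_singular=10, threshold=0.5):
    # Largest singular values (np.linalg.svd returns them in descending order)
    s = np.linalg.svd(img, compute_uv=False)[:n_singular]
    # Extra feature: number of pixels above the (assumed) ink threshold
    ink_count = np.count_nonzero(img > threshold)
    return np.concatenate([s, [ink_count]])

def nearest_index(database, query):
    # Standardize each feature column using the database statistics
    mean = database.mean(axis=0)
    std = database.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    db = (database - mean) / std
    q = (query - mean) / std
    # Index of the database row with the smallest squared distance to the query
    return int(np.argmin(np.sum((db - q) ** 2, axis=1)))

# Toy 12x12 images (hypothetical): a vertical bar and a diagonal stroke
bar = np.zeros((12, 12))
bar[2:10, 5] = 1.0
diag = np.zeros((12, 12))
diag[2:10, 2:10] = np.eye(8)

# Query: a slightly longer vertical bar
query_img = np.zeros((12, 12))
query_img[2:11, 5] = 1.0

database = np.array([feature_vector(bar), feature_vector(diag)])
match = nearest_index(database, feature_vector(query_img))
```

Here `match` is the row index of the closest database entry; whether extra features actually help has to be checked on the real digit images.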