I'm new to machine learning and I was following along with the TensorFlow official MNIST model (https://github.com/tensorflow/models/tree/master/official/mnist). After training the model for 3 epochs and getting an accuracy of over 98%, I decided to test the model with some of my own handwritten images, which are very close to those found in the MNIST dataset:
{'loss': 0.03686057, 'global_step': 2400, 'accuracy': 0.98729998}
handwritten 1, predicted as 2: https://storage.googleapis.com/imageexamples/example1.png
handwritten 4, predicted as 5: https://storage.googleapis.com/imageexamples/example4.png
handwritten 7, predicted correctly as 7: https://storage.googleapis.com/imageexamples/example7.png
However, as you can see from the output below, the predictions were mostly incorrect. Can anyone share some insight into why this might be happening? If you want any other info, please let me know. Thanks!
[2 5 7]
Result for output key probabilities:
[[ 1.47042423e-01  1.40417784e-01  2.80471593e-01  1.18162427e-02  1.71029475e-02  1.15245730e-01  9.41787264e-04  1.71402004e-02  2.61987478e-01  7.83374347e-03]
 [ 3.70134876e-05  3.59491096e-03  1.70885725e-03  3.44008535e-01  1.75098982e-02  6.24581575e-01  1.02930271e-05  3.97418407e-05  7.59732258e-03  9.11886105e-04]
 [ 7.62941269e-03  7.74145573e-02  1.42017215e-01  4.73754480e-03  3.75231934e-06  7.16139004e-03  4.40478354e-04  7.60131121e-01  4.09408152e-04  5.51677040e-05]]
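For context on how I'm getting those numbers: the "Result for output key ..." format above is what TensorFlow's saved_model_cli prints when it runs the exported SavedModel on the batched .npy file. A rough Python equivalent is below; the export path, the signature name 'classify', and the input key 'image' are assumptions based on my reading of the official model's export code, so treat them as placeholders rather than gospel:

import numpy as np
import tensorflow as tf

# Load the SavedModel exported by the official mnist.py (the path stands in for
# the timestamped export directory). Signature name and input key are assumptions.
predict_fn = tf.contrib.predictor.from_saved_model(
    "/tmp/mnist_saved_model/1234567890", signature_def_key="classify")

images = np.load("images0.npy").astype(np.float32)  # (3, 28, 28) batch from the conversion script below

result = predict_fn({"image": images})
print(result["classes"])        # predicted digit for each image
print(result["probabilities"])  # softmax probabilities, as shown above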
Here is the script I used to convert the PNGs into .npy arrays for testing. The arrays it produces for the example '3' and '5' images provided in the TF repository are identical to the .npy arrays given there (I've put the quick check I mean after the script), so I don't think the conversion is the issue:
import numpy as np
from array import array
from PIL import Image

# (The --images, --output and --batch flags are defined further up in the script
# with tf.app.flags; that part is omitted here.)

def main(unused_argv):
    output = []
    images = []
    filename_generate = True
    index = 0
    if FLAGS.images is not None:
        images = str.split(FLAGS.images)
    if FLAGS.output != "":  # check for output names and make sure outputs map to images
        output = str.split(FLAGS.output)
        filename_generate = False
        if len(output) != len(images):
            raise ValueError('The number of image files and output files must be the same.')
    if FLAGS.batch == "True":
        combined_arr = np.array([])  # we'll be concatenating the image arrays into one batch
    for image_name in images:
        input_image = Image.open(image_name).convert('L')  # convert to grayscale
        input_image = input_image.resize((28, 28))  # resize the image, if needed
        width, height = input_image.size
        data_image = array('B')
        pixel = input_image.load()
        for x in range(0, width):
            for y in range(0, height):
                data_image.append(pixel[y, x])  # swap the indices so pixels are written row by row, as in the MNIST format
        np_image = np.array(data_image)
        img_arr = np.reshape(np_image, (1, 28, 28))
        img_arr = img_arr / float(255)  # use scale of [0, 1]
        if FLAGS.batch != "True":
            if filename_generate:
                np.save("image" + str(index), img_arr)  # save each image with a generated filename
            else:
                np.save(output[index], img_arr)  # save each image with its chosen filename
            index = index + 1
        else:
            if combined_arr.size == 0:
                combined_arr = img_arr
            else:
                combined_arr = np.concatenate((combined_arr, img_arr), axis=0)  # add all image arrays to one array
    if FLAGS.batch == "True":
        if filename_generate:
            np.save("images" + str(index), combined_arr)  # save the batched images with a generated filename
        else:
            np.save(output[0], combined_arr)  # save the batched images with the chosen filename
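And this is roughly the kind of check I mean when I say the converted arrays match the repository's examples; the file names are placeholders for my own script's output and for the example array from the repo:

import numpy as np
import matplotlib.pyplot as plt

mine = np.load("image0.npy")         # produced by the script above from one of my PNGs
reference = np.load("example3.npy")  # example array from the TF models repo (placeholder name)

print(mine.shape, reference.shape)      # both should be (1, 28, 28)
print(np.array_equal(mine, reference))  # True when the conversion matches exactly

# Visual check that the converted image looks like an MNIST sample:
# 28x28 grayscale, values in [0, 1].
plt.imshow(mine[0], cmap='gray')
plt.show()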
I haven't changed anything in the official MNIST model except the number of epochs (from 40 down to 3, since training was taking so long and the accuracy was already high after a single epoch).
Thanks so much!