Edit tensorflow inceptionV3 retraining-example.py

Posted 2019-04-20 05:23

Question:

TLDR: Cannot figure out how to use retrained inceptionV3 for multiple image predictions.

Hello kind people :) I've spent a few days searching many stackoverflow posts and the documentation, but I could not find an answer to this question. Would greatly appreciate any help on this!

I have retrained a tensorflow inceptionV3 model on new pictures, and it is able to work on new images by following the instructions at https://www.tensorflow.org/versions/r0.9/how_tos/image_retraining/index.html and using the following commands:

bazel build tensorflow/examples/label_image:label_image && \
bazel-bin/tensorflow/examples/label_image/label_image \
--graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
--output_layer=final_result \
--image=IMAGE_DIRECTORY_TO_CLASSIFY

However, I need to classify multiple images (like a dataset), and am seriously stuck on how to do so. I've found the following example at

https://github.com/eldor4do/Tensorflow-Examples/blob/master/retraining-example.py

on how to use the retrained model, but again, it is sparse on details about how to adapt it to classify multiple images.

From what I've gathered from the MNIST tutorial, I need to pass a feed_dict to sess.run(), but I got stuck there because I couldn't work out how to do that in this context.

Any assistance will be extremely appreciated! :)

EDIT:

Running Styrke's script with some modifications, I got this:

    waffle@waffleServer:~/git$ python tensorflowMassPred.py
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
    /home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py:1197: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
      result_shape.insert(dim, 1)
    I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
    name: GeForce GTX 660
    major: 3 minor: 0 memoryClockRate (GHz) 1.0975
    pciBusID 0000:01:00.0
    Total memory: 2.00GiB
    Free memory: 1.78GiB
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 660, pci bus id: 0000:01:00.0)
    W tensorflow/core/framework/op_def_util.cc:332] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
    E tensorflow/core/common_runtime/executor.cc:334] Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'T' not in Op<name=MaxPool; signature=input:float -> output:float; attr=ksize:list(int),min=4; attr=strides:list(int),min=4; attr=padding:string,allowed=["SAME", "VALID"]; attr=data_format:string,default="NHWC",allowed=["NHWC", "NCHW"]>; NodeDef: pool = MaxPool[T=DT_FLOAT, data_format="NHWC", ksize=[1, 3, 3, 1], padding="VALID", strides=[1, 2, 2, 1], _device="/job:localhost/replica:0/task:0/gpu:0"](pool/control_dependency)
         [[Node: pool = MaxPool[T=DT_FLOAT, data_format="NHWC", ksize=[1, 3, 3, 1], padding="VALID", strides=[1, 2, 2, 1], _device="/job:localhost/replica:0/task:0/gpu:0"](pool/control_dependency)]]
    Traceback (most recent call last):
      File "/home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 715, in _do_call
        return fn(*args)
      File "/home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 697, in _run_fn
        status, run_metadata)
      File "/home/waffle/anaconda3/lib/python3.5/contextlib.py", line 66, in __exit__
        next(self.gen)
      File "/home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/errors.py", line 450, in raise_exception_on_not_ok_status
        pywrap_tensorflow.TF_GetCode(status))
    tensorflow.python.framework.errors.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=MaxPool; signature=input:float -> output:float; attr=ksize:list(int),min=4; attr=strides:list(int),min=4; attr=padding:string,allowed=["SAME", "VALID"]; attr=data_format:string,default="NHWC",allowed=["NHWC", "NCHW"]>; NodeDef: pool = MaxPool[T=DT_FLOAT, data_format="NHWC", ksize=[1, 3, 3, 1], padding="VALID", strides=[1, 2, 2, 1], _device="/job:localhost/replica:0/task:0/gpu:0"](pool/control_dependency)
         [[Node: pool = MaxPool[T=DT_FLOAT, data_format="NHWC", ksize=[1, 3, 3, 1], padding="VALID", strides=[1, 2, 2, 1], _device="/job:localhost/replica:0/task:0/gpu:0"](pool/control_dependency)]]

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "tensorflowMassPred.py", line 116, in <module>
        run_inference_on_image()
      File "tensorflowMassPred.py", line 98, in run_inference_on_image
        {'DecodeJpeg/contents:0': image_data})
      File "/home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 372, in run
        run_metadata_ptr)
      File "/home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 636, in _run
        feed_dict_string, options, run_metadata)
      File "/home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 708, in _do_run
        target_list, options, run_metadata)
      File "/home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 728, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=MaxPool; signature=input:float -> output:float; attr=ksize:list(int),min=4; attr=strides:list(int),min=4; attr=padding:string,allowed=["SAME", "VALID"]; attr=data_format:string,default="NHWC",allowed=["NHWC", "NCHW"]>; NodeDef: pool = MaxPool[T=DT_FLOAT, data_format="NHWC", ksize=[1, 3, 3, 1], padding="VALID", strides=[1, 2, 2, 1], _device="/job:localhost/replica:0/task:0/gpu:0"](pool/control_dependency)
         [[Node: pool = MaxPool[T=DT_FLOAT, data_format="NHWC", ksize=[1, 3, 3, 1], padding="VALID", strides=[1, 2, 2, 1], _device="/job:localhost/replica:0/task:0/gpu:0"](pool/control_dependency)]]
    Caused by op 'pool', defined at:
      File "tensorflowMassPred.py", line 116, in <module>
        run_inference_on_image()
      File "tensorflowMassPred.py", line 87, in run_inference_on_image
        create_graph()
      File "tensorflowMassPred.py", line 68, in create_graph
        _ = tf.import_graph_def(graph_def, name='')
      File "/home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 274, in import_graph_def
        op_def=op_def)
      File "/home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2260, in create_op
        original_op=self._default_original_op, op_def=op_def)
      File "/home/waffle/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1230, in __init__
        self._traceback = _extract_stack()

This is the script (some functions have been removed):

import csv
import glob
import os

import numpy as np
import pandas as pd
import tensorflow as tf

os.chdir('tensorflow/')  # only needed if the script must run from the tensorflow directory

imagePath = '../_images_processed/test'
modelFullPath = '/tmp/output_graph.pb'
labelsFullPath = '/tmp/output_labels.txt'

# FILE NAME TO SAVE TO.
SAVE_TO_CSV = 'tensorflowPred.csv'


def makeCSV():
    global SAVE_TO_CSV
    with open(SAVE_TO_CSV,'w') as f:
        writer = csv.writer(f)
        writer.writerow(['id','label'])


def makeUniqueDic():
    global SAVE_TO_CSV
    df = pd.read_csv(SAVE_TO_CSV)
    doneID = df['id']
    unique = doneID.unique()
    uniqueDic = {str(key):'' for key in unique} #for faster lookup
    return uniqueDic


def create_graph():
    """Loads the saved GraphDef file into the default graph."""
    # Creates graph from saved graph_def.pb.
    with tf.gfile.FastGFile(modelFullPath, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        _ = tf.import_graph_def(graph_def, name='')


def run_inference_on_image():
    answer = []
    global imagePath
    if not tf.gfile.IsDirectory(imagePath):
        tf.logging.fatal('imagePath directory does not exist %s', imagePath)
        return answer

    if not os.path.exists(SAVE_TO_CSV):
        makeCSV()

    files = glob.glob(imagePath+'/*.jpg')
    uniqueDic = makeUniqueDic()        
    # Get a list of all files in imagePath directory
    #image_list = tf.gfile.ListDirectory(imagePath)

    # Creates graph from saved GraphDef.
    create_graph()

    with tf.Session() as sess:

        softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')

        for pic in files:
            name = getNamePicture(pic)
            if name not in uniqueDic:
                image_data = tf.gfile.FastGFile(pic, 'rb').read()
                predictions = sess.run(softmax_tensor,
                                   {'DecodeJpeg/contents:0': image_data})
                predictions = np.squeeze(predictions)

                top_k = predictions.argsort()[-5:][::-1]  # Getting top 5 predictions
                # Text mode: on Python 3, str() of a bytes line gives "b'...'".
                with open(labelsFullPath, 'r') as labels_file:
                    labels = [line.strip() for line in labels_file]
#            for node_id in top_k:
#                human_string = labels[node_id]
#                score = predictions[node_id]
#                print('%s (score = %.5f)' % (human_string, score))
                pred = labels[top_k[0]]
                with open(SAVE_TO_CSV,'a') as f:
                    writer = csv.writer(f)
                    writer.writerow([name,pred])
    return answer

if __name__ == '__main__':
    run_inference_on_image()
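
Note: getNamePicture is one of the removed functions. As a rough stand-in, assuming it simply returns the file name without its directory or extension, it might look like the sketch below. The load_done_ids helper is also hypothetical; it shows the same "skip what is already in the CSV" idea as makeUniqueDic, using only the standard library instead of pandas.

```python
import csv
import os
import tempfile


def getNamePicture(path):
    """Hypothetical stand-in: file name without directory or extension."""
    return os.path.splitext(os.path.basename(path))[0]


def load_done_ids(csv_path):
    """Return the set of ids already written to the CSV (empty if missing)."""
    if not os.path.exists(csv_path):
        return set()
    with open(csv_path, newline='') as f:
        return {row['id'] for row in csv.DictReader(f)}


# Demonstration with a throwaway CSV; the id and label values are made up.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, 'tensorflowPred.csv')
    with open(csv_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['id', 'label'])
        writer.writerow(['img_001', 'daisy'])
    done = load_done_ids(csv_path)

name = getNamePicture('../_images_processed/test/img_002.jpg')
print(name, name in done)
```

A set gives the same fast membership test as the dictionary with empty values.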

Answer 1:

The raw JPEG data seems to be fed directly to a decode_jpeg operation, which takes only a single image at a time. To process more than one image at a time, you would probably need to define more decode_jpeg ops. If that is possible, I don't currently know how.

The next best thing, which is easy, is probably to classify all the images one by one with a loop inside the TensorFlow session. This way you at least avoid reloading the graph and starting a new TF session for every image you want to classify, both of which can take quite a bit of time if you have to do them repeatedly.

Here I have changed the definition of the run_inference_on_image() function so it should classify all images in the directory that is specified by the imagePath variable. I have not tested this code, so there may be minor problems that need to be fixed.

def run_inference_on_image():
    answer = []

    if not tf.gfile.IsDirectory(imagePath):
        tf.logging.fatal('imagePath directory does not exist %s', imagePath)
        return answer

    # Get a list of all files in imagePath directory
    image_list = tf.gfile.ListDirectory(imagePath)

    # Creates graph from saved GraphDef.
    create_graph()

    with tf.Session() as sess:

        softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')

        # Read the labels once, in text mode, so they are plain strings on Python 3.
        with open(labelsFullPath, 'r') as f:
            labels = [line.strip() for line in f]

        for i in image_list:
            # ListDirectory returns bare file names, so join them with imagePath
            # (os is already imported at the top of the script).
            image_data = tf.gfile.FastGFile(os.path.join(imagePath, i), 'rb').read()
            predictions = sess.run(softmax_tensor,
                                   {'DecodeJpeg/contents:0': image_data})
            predictions = np.squeeze(predictions)

            top_k = predictions.argsort()[-5:][::-1]  # Getting top 5 predictions
            for node_id in top_k:
                human_string = labels[node_id]
                score = predictions[node_id]
                print('%s (score = %.5f)' % (human_string, score))

            answer.append(labels[top_k[0]])
    return answer
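
One thing to watch when reading the labels file on Python 3: if it is opened in binary mode, str() of each line gives something like "b'daisy\\n'" rather than "daisy", so text mode plus strip() is safer. Here is a small sketch of that, runnable without TensorFlow. The names load_labels and top_labels are my own, the label values are made up, and top_labels is a plain-Python stand-in for the argsort idiom rather than part of the original script.

```python
import os
import tempfile


def load_labels(path):
    """Read one label per line in text mode, dropping trailing newlines."""
    with open(path, 'r') as f:
        return [line.strip() for line in f]


def top_labels(scores, labels, k=5):
    """Return the k (label, score) pairs with the highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], scores[i]) for i in order[:k]]


# Demonstration with a throwaway labels file.
with tempfile.TemporaryDirectory() as tmp:
    labels_path = os.path.join(tmp, 'output_labels.txt')
    with open(labels_path, 'w') as f:
        f.write('daisy\nrose\ntulip\n')
    labels = load_labels(labels_path)

best = top_labels([0.1, 0.7, 0.2], labels, k=2)
print(best)

# What str() does to a line read in binary mode on Python 3.
bytes_repr = str(b'daisy\n')
print(bytes_repr)
```

When two scores are tied, the order of the tied entries may differ from what argsort gives.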


Answer 2:

So looking at your linked script:

with tf.Session() as sess:

    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor,
                           {'DecodeJpeg/contents:0': image_data})
    predictions = np.squeeze(predictions)

    top_k = predictions.argsort()[-5:][::-1] # Getting top 5 predictions

Within this snippet, image_data is the new image that you want to feed to the model. It is loaded a few lines earlier:

image_data = tf.gfile.FastGFile(imagePath, 'rb').read()

So my instinct would be to change run_inference_on_image to accept imagePath as a parameter, and then use os.listdir and os.path.join to call it on each image in your dataset.
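
A minimal sketch of that idea, assuming the per-image prediction is wrapped in a callable. Both classify_directory and classify_fn are hypothetical names. In the real script, classify_fn would call sess.run on the file's bytes inside a single tf.Session. Here a stand-in is used so the sketch runs without TensorFlow.

```python
import os
import tempfile


def classify_directory(image_dir, classify_fn, extensions=('.jpg', '.jpeg')):
    """Apply classify_fn to the bytes of every image file in image_dir.

    Returns a dict mapping file name to whatever classify_fn returns.
    """
    results = {}
    for name in sorted(os.listdir(image_dir)):
        if not name.lower().endswith(extensions):
            continue  # skip anything that is not a JPEG
        path = os.path.join(image_dir, name)
        with open(path, 'rb') as f:
            results[name] = classify_fn(f.read())
    return results


# Stand-in classifier: it only reports the size of the file in bytes.
with tempfile.TemporaryDirectory() as tmp:
    for name, payload in [('a.jpg', b'12'), ('b.JPG', b'123'), ('notes.txt', b'x')]:
        with open(os.path.join(tmp, name), 'wb') as f:
            f.write(payload)
    demo = classify_directory(tmp, len)

print(demo)
```

Passing the classifier in as an argument keeps the directory walk separate from the TensorFlow code, so the graph and session can be created once outside the loop.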



Answer 3:

I had the same issue. After trying every suggested fix, I finally found one that worked for me. This error occurs when the version of TensorFlow used to retrain the model differs from the version used to run it.

The solution is to update TensorFlow to the latest version. Since I had installed TensorFlow with pip, I only had to run the following command:

sudo pip install tensorflow --upgrade 

And it worked perfectly.
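
To confirm a mismatch before upgrading, it may help to print the installed version on both the machine that retrained the model and the one running inference. A small sketch, where installed_tf_version is a hypothetical helper name and tf.__version__ is the only TensorFlow attribute used:

```python
def installed_tf_version():
    """Return the installed TensorFlow version string, or None if absent."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


version = installed_tf_version()
print('TensorFlow version:', version if version else 'not installed')
```

If the two machines print different versions, that would be consistent with the MaxPool attr 'T' error above.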