Locally load saved tensorflow model .pb from googl

I'd like to take the TensorFlow model I've trained online and run it locally with a Python program I distribute.

After training, I get a directory /model containing a file saved_model.pb and a folder variables. What is the simplest way to deploy this locally?

I was following here for deploying frozen models, but I can't quite read in the .pb. I downloaded saved_model.pb to my working directory and tried:

with tf.gfile.GFile("saved_model.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

google.protobuf.message.DecodeError: Truncated message.

Looking on SO here, they suggested a different route.

with tf.gfile.GFile("saved_model.pb", "rb") as f:
    proto_b=f.read()
    graph_def = tf.GraphDef()
    text_format.Merge(proto_b, graph_def) 

builtins.TypeError: a bytes-like object is required, not 'str'

I find this confusing since

type(proto_b)
<class 'bytes'>
type(graph_def)
<class 'tensorflow.core.framework.graph_pb2.GraphDef'>

Why the error? Neither object is a string.

What's the best way to deploy a cloud trained model?

Full code

import tensorflow as tf
import sys
from google.protobuf import text_format


# change this as you see fit
#image_path = sys.argv[1]
image_path="test.jpg"

# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()

# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line 
               in tf.gfile.GFile("dict.txt")]

# Unpersists graph from file
with tf.gfile.GFile("saved_model.pb", "rb") as f:
    proto_b=f.read()
    graph_def = tf.GraphDef()
    text_format.Merge(proto_b, graph_def) 

with tf.Session() as sess:
    # Feed the image_data as input to the graph and get first prediction
    softmax_tensor = sess.graph.get_tensor_by_name('conv1/weights:0')

    predictions = sess.run(softmax_tensor, \
                           {'DecodeJpeg/contents:0': image_data})

    # Sort to show labels of first prediction in order of confidence
    top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]

    for node_id in top_k:
        human_string = label_lines[node_id]
        score = predictions[0][node_id]
        print('%s (score = %.5f)' % (human_string, score))

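The model you deployed to the Cloud ML Engine service is in the SavedModel format. The saved_model.pb file holds a SavedModel protocol buffer rather than a GraphDef, which is why parsing it as a GraphDef fails. Loading a SavedModel in Python is fairly simple using the loader module: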

import tensorflow as tf

# path_to_model is the directory that contains saved_model.pb and variables/
with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        path_to_model)
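
The tensor names to feed and fetch can be read from the model itself rather than guessed. The load call above returns the loaded MetaGraphDef, whose signature_def map lists the inputs and outputs of each exported signature. Here is a minimal sketch, assuming the model was exported with a signature under the default key "serving_default". The helper name signature_tensor_names is hypothetical and not part of TensorFlow:

```python
def signature_tensor_names(meta_graph_def, signature_key="serving_default"):
    """Return ({input alias: tensor name}, {output alias: tensor name}).

    meta_graph_def is the value returned by tf.saved_model.loader.load.
    signature_key is an assumption: "serving_default" is TensorFlow's default
    serving key, but a model may have been exported under another one.
    """
    signatures = meta_graph_def.signature_def
    if signature_key not in signatures:
        raise KeyError("no signature %r; available: %s"
                       % (signature_key, sorted(signatures.keys())))
    signature = signatures[signature_key]
    # Each entry maps a user-facing alias to a TensorInfo with a graph tensor name.
    inputs = {alias: info.name for alias, info in signature.inputs.items()}
    outputs = {alias: info.name for alias, info in signature.outputs.items()}
    return inputs, outputs
```

Inside the with block, pass the result of the load call to this helper, then use the returned names with sess.graph.get_tensor_by_name and as keys in the feed dict.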

To perform inference, your code is almost correct; you need to make sure that you are feeding a batch to sess.run, so just wrap image_data in a list:

# Feed the image_data as input to the graph and get first prediction
softmax_tensor = sess.graph.get_tensor_by_name('conv1/weights:0')

predictions = sess.run(softmax_tensor,
                       {'DecodeJpeg/contents:0': [image_data]})

# Sort to show labels of first prediction in order of confidence
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]

for node_id in top_k:
    human_string = label_lines[node_id]
    score = predictions[0][node_id]
    print('%s (score = %.5f)' % (human_string, score))

(Note that, depending on your graph, wrapping image_data in a list may increase the rank of your predictions tensor, and you would need to adjust the code accordingly.)
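
To illustrate that adjustment, here is a small NumPy sketch of ranking labels when the output carries extra leading batch dimensions. The helper name rank_labels and the scores are made up for illustration. It assumes exactly one image was fed, so the output can be flattened to one score per label:

```python
import numpy as np

def rank_labels(predictions, labels):
    """Return (label, score) pairs sorted by descending score.

    Assumes predictions holds the scores for a single image, possibly wrapped
    in extra length-1 batch dimensions, e.g. shape (N,), (1, N) or (1, 1, N).
    """
    scores = np.asarray(predictions).reshape(-1)  # flatten to shape (N,)
    if scores.shape[0] != len(labels):
        raise ValueError("got %d scores for %d labels"
                         % (scores.shape[0], len(labels)))
    order = np.argsort(scores)[::-1]  # indices from highest to lowest score
    return [(labels[i], float(scores[i])) for i in order]

# Made-up scores with an extra batch dimension, shape (1, 1, 3).
for label, score in rank_labels([[[0.1, 0.7, 0.2]]], ["cat", "dog", "bird"]):
    print('%s (score = %.5f)' % (label, score))
```

The length check is there so that feeding more than one image, or fetching a tensor that is not the score vector, fails loudly instead of printing mismatched labels.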