I am looking at Google's example of how to deploy and use a pre-trained TensorFlow graph (model) on Android. This example uses a .pb file, available at:
https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip
(the link downloads the file automatically).
The example shows how to load the .pb file into a TensorFlow session and use it to perform classification, but it doesn't seem to mention how to generate such a .pb file after a graph has been trained (e.g., in Python).
Are there any examples of how to do that?
Alternatively to my previous answer using freeze_graph(), which is only good if you call it as a script, there is a very nice function that does all the heavy lifting for you and is suitable for calling from your normal model training code.

convert_variables_to_constants() does two things: it freezes the weights by replacing variables with constants, and it removes nodes that are not needed to compute the requested output nodes.

Assuming sess is your tf.Session() and "output" is the name of your prediction node, the following code will serialize your minimal graph into both textual and binary protobuf.

Here's another take on @Mostafa's answer. A somewhat cleaner way to run the tf.assign ops is to store them in a tf.group. Here's my Python code:

And in C++:
This way you have only one named op to run on the C++ side, so you don't have to mess around with iterating over nodes.
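As a rough illustration of the grouped-assign idea, here is a minimal Python sketch. It is not the original poster's code: the toy graph, the node names "input", "w" and "output", and the file name graph.pb are assumptions made for the example, and the tf.compat.v1 aliases are used so that it can run on TensorFlow 2.x as well as late 1.x releases. The loading half is written in Python only to keep the sketch in one language; a C++ client would make the same two run calls.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

export_dir = tempfile.mkdtemp()

# Training side: a toy graph standing in for a trained model (assumption).
train_graph = tf.Graph()
with train_graph.as_default():
    x = tf1.placeholder(tf.float32, shape=[2], name="input")
    w = tf1.Variable(np.array([1.0, 2.0], dtype=np.float32), name="w")
    tf.reduce_sum(x * w, name="output")
    with tf1.Session(graph=train_graph) as sess:
        sess.run(tf1.global_variables_initializer())
        # Bake each variable's current value into a constant and create an
        # op that assigns that constant back to the variable.
        assign_ops = []
        for var in tf1.global_variables():
            value = sess.run(var)
            assign_ops.append(var.assign(tf.constant(value), read_value=False))
        # Group the assigns so the loading side has a single named op to run.
        tf.group(*assign_ops, name="assign_variables")
        tf.io.write_graph(train_graph.as_graph_def(), export_dir,
                          "graph.pb", as_text=False)

# Loading side: parse the GraphDef and import it into a fresh graph.
graph_def = tf1.GraphDef()
with open(os.path.join(export_dir, "graph.pb"), "rb") as f:
    graph_def.ParseFromString(f.read())

load_graph = tf.Graph()
with load_graph.as_default():
    tf.import_graph_def(graph_def, name="")

with tf1.Session(graph=load_graph) as sess:
    sess.run("assign_variables")  # restores the trained values
    result = sess.run("output:0", feed_dict={"input:0": [3.0, 4.0]})
```

Note that the variables are never initialized in the second session; running the grouped op is what gives them their values.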
I found a freeze_graph() function in the TensorFlow codebase that might be helpful when doing this. From what I understand, it swaps variables with constants before serializing the GraphDef, so when you then load this graph from C++ it has no variables that need to be set anymore, and you can use it directly for predictions. There is also a test for it and some description in the Guide.
This seems like the cleanest option here.
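For an in-process version of the same freezing idea, the sketch below uses graph_util.convert_variables_to_constants(), the function mentioned in another answer in this thread, and writes the result as both text and binary protobuf. The helper name freeze_session, the toy graph, the node names "input", "w" and "output", and the file names are assumptions for illustration; the tf.compat.v1 aliases are used so it can run on TensorFlow 2.x.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()


def freeze_session(sess, output_node_names, export_dir):
    """Hypothetical helper: freeze sess.graph and write it as .pbtxt and .pb."""
    # Replaces variables with constants holding their current values and
    # keeps only the nodes needed to compute output_node_names.
    frozen = tf1.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), output_node_names)
    tf.io.write_graph(frozen, export_dir, "frozen.pbtxt", as_text=True)
    tf.io.write_graph(frozen, export_dir, "frozen.pb", as_text=False)
    return frozen


# Toy graph standing in for a trained model (assumption).
graph = tf.Graph()
with graph.as_default():
    x = tf1.placeholder(tf.float32, shape=[None, 2], name="input")
    w = tf1.Variable(np.array([[1.0], [2.0]], dtype=np.float32), name="w")
    tf.matmul(x, w, name="output")
    with tf1.Session(graph=graph) as sess:
        sess.run(tf1.global_variables_initializer())
        export_dir = tempfile.mkdtemp()
        frozen_graph_def = freeze_session(sess, ["output"], export_dir)
```

The binary frozen.pb is the kind of file a client would load; the text version is mainly useful for inspecting which nodes survived.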
I could not figure out how to implement the method described by mrry, but here is how I solved it. I'm not sure it is the best way of solving the problem, but at least it solves it.
As write_graph can also store the values of constants, I added the following code to the Python script just before writing the graph with the write_graph function:
This creates constants that store the variables' values after training, and then creates "assign_variables" tensors that assign them back to the variables. Now, when you call write_graph, it will store the variables' values in the file in the form of constants.
The only remaining part is to call these "assign_variables" tensors in the C code to make sure that your variables are assigned the constant values stored in the file. Here is one way to do it:
EDIT: The freeze_graph.py script, which is part of the TensorFlow repository, now serves as a tool that generates a protocol buffer representing a "frozen" trained model from an existing TensorFlow GraphDef and a saved checkpoint. It uses the same steps as described below, but it is much easier to use.

Currently the process isn't very well documented (and is subject to refinement), but the approximate steps are as follows:

1. Build and train your model as normal, in a tf.Graph called g_1.
2. Fetch the final values of all of the variables and store them as numpy arrays (using Session.run()).
3. In a new tf.Graph called g_2, create tf.constant() tensors for each of the variables, using the value of the corresponding numpy array fetched in step 2.
4. Use tf.import_graph_def() to copy nodes from g_1 into g_2, and use the input_map argument to replace each variable in g_1 with the corresponding tf.constant() tensors created in step 3. You may also want to use input_map to specify a new input tensor (e.g. replacing an input pipeline with a tf.placeholder()). Use the return_elements argument to specify the name of the predicted output tensor.
5. Call g_2.as_graph_def() to get a protocol buffer representation of the graph.

(NOTE: The generated graph will have extra nodes in the graph for training. Although it is not part of the public API, you may wish to use the internal graph_util.extract_sub_graph() function to strip these nodes from the graph.)

Just found this post and it was very useful, thanks! I'm also going with @Mostafa's method, though my C++ code is a bit different:
NB: I use "var_hack" as my variable name in Python.
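Returning to the manual g_1/g_2 steps listed earlier in the thread, they can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the toy model, the node names "input", "w" and "output", and the import prefix "frozen" are made up for the example; use_resource=False is passed so that each variable has a plain read tensor that input_map can replace; and the tf.compat.v1 aliases are used so the sketch can run on TensorFlow 2.x.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

# Step 1: build (and normally train) the model in g_1. Toy model (assumption).
g_1 = tf.Graph()
with g_1.as_default():
    x = tf1.placeholder(tf.float32, shape=[None, 2], name="input")
    w = tf1.Variable(np.array([[1.0], [2.0]], dtype=np.float32),
                     name="w", use_resource=False)
    tf.matmul(x, w, name="output")
    variables = tf1.global_variables()
    # Names of the tensors through which the rest of the graph reads each variable.
    read_names = [v.value().name for v in variables]
    # Step 2: fetch the variables' values as numpy arrays.
    with tf1.Session(graph=g_1) as sess:
        sess.run(tf1.global_variables_initializer())
        values = sess.run(variables)

# Steps 3 and 4: create constants in g_2 and import g_1 with the variable
# reads remapped to those constants.
g_2 = tf.Graph()
with g_2.as_default():
    input_map = {name: tf.constant(value)
                 for name, value in zip(read_names, values)}
    output = tf.import_graph_def(g_1.as_graph_def(), input_map=input_map,
                                 return_elements=["output:0"], name="frozen")[0]

# Step 5: serialize g_2, then strip nodes not needed for the output.
frozen_graph_def = g_2.as_graph_def()
pruned_graph_def = tf1.graph_util.extract_sub_graph(
    frozen_graph_def, ["frozen/output"])
```

Because the nodes are imported under the "frozen" prefix, the input placeholder in g_2 is named "frozen/input:0" and the output node is "frozen/output".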