I want to use a pretrained model to warm-start another model that differs slightly from it. Simply put, I create the new model and assign each of its variables the weights of the pretrained variable with the same name. But when saving the model, the following error occurred:
Traceback (most recent call last):
  File "tf_test.py", line 23, in <module>
    save_path = saver.save(sess, "./model.ckpt")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1308, in save
    self.export_meta_graph(meta_graph_filename)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1331, in export_meta_graph
    graph_def=ops.get_default_graph().as_graph_def(add_shapes=True),
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2268, in as_graph_def
    result, _ = self._as_graph_def(from_version, add_shapes)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2231, in _as_graph_def
    raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.
The example code is as follows:
import tensorflow as tf
import numpy as np

v1 = tf.get_variable("L_enc", [400000, 1024])
v2 = tf.get_variable("L_dec", [400000, 1024])

init_op = tf.initialize_all_variables()
saver = tf.train.Saver(tf.all_variables())

with tf.Session() as sess:
    sess.run(init_op)
    for v in tf.trainable_variables():
        embedding = np.random.uniform(-1, 1, (400000, 1024))
        sess.run(v.assign(embedding))

    # Save the variables to disk.
    save_path = saver.save(sess, "./model.ckpt")
    print("Model saved in file: %s" % save_path)
https://github.com/tensorflow/tensorflow/issues/4291
Fabrizio correctly points out that there's a hard 2GB limit on the size of protocol buffers, but you might be wondering why your program hits that limit. The problem stems from these lines:

    for v in tf.trainable_variables():
        embedding = np.random.uniform(-1, 1, (400000, 1024))
        sess.run(v.assign(embedding))
When the execution hits v.assign(embedding), new nodes are added to the TensorFlow graph. In particular, each embedding array is converted to a tf.constant() tensor, which will be quite large (approximately 328MB by my estimate).
The best way to avoid this is to load the variables from the previous model directly into your new model using a tf.train.Saver. Since the models might have a different structure, you might need to specify a mapping from the names of variables in the old model to the tf.Variable objects in your new model.
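For example, if the new model creates its variables under a different name scope, you can build a tf.train.Saver whose var_list is a dictionary from checkpoint names to the new tf.Variable objects. A sketch, where the scope name and checkpoint path are assumptions for illustration:

import tensorflow as tf

# New model: same shapes as the pretrained model, different names.
L_enc = tf.get_variable("new_model/L_enc", [400000, 1024])
L_dec = tf.get_variable("new_model/L_dec", [400000, 1024])

# Map names as stored in the pretrained checkpoint ("L_enc", "L_dec")
# to the corresponding variables in the new model.
restorer = tf.train.Saver({"L_enc": L_enc, "L_dec": L_dec})

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    # Overwrites the freshly initialized values with the pretrained
    # weights; no large constant nodes are added to the graph.
    restorer.restore(sess, "./pretrained_model.ckpt")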
An alternative way to solve your problem would be to pre-create a tf.placeholder() op for assigning a value to each variable. This might require more restructuring of your actual code, but something along the following lines worked for me (sketched here from the question's example; the helper names placeholders and assign_ops are illustrative):
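import tensorflow as tf
import numpy as np

v1 = tf.get_variable("L_enc", [400000, 1024])
v2 = tf.get_variable("L_dec", [400000, 1024])

# Pre-create one placeholder and one assign op per variable, so that
# feeding new values does not keep adding nodes to the graph.
variables = [v1, v2]
placeholders = [tf.placeholder(tf.float32, shape=[400000, 1024])
                for _ in variables]
assign_ops = [v.assign(p) for v, p in zip(variables, placeholders)]

init_op = tf.initialize_all_variables()
saver = tf.train.Saver(tf.all_variables())

with tf.Session() as sess:
    sess.run(init_op)
    for p, assign_op in zip(placeholders, assign_ops):
        embedding = np.random.uniform(-1, 1, (400000, 1024))
        # The array is fed at run time instead of being baked into the
        # graph as a constant.
        sess.run(assign_op, {p: embedding})

    # Save the variables to disk.
    save_path = saver.save(sess, "./model.ckpt")
    print("Model saved in file: %s" % save_path)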