TensorFlow: “GraphDef cannot be larger than 2GB.”

Posted 2019-05-19 10:39

Question:

I want to use a pretrained model to warm-start another, slightly different model. Simply put, I create the new model and assign its variables (which have the same names as in the pretrained model) the pretrained weights. But when saving the model, the following error occurred.

Traceback (most recent call last):
  File "tf_test.py", line 23, in <module>
    save_path = saver.save(sess, "./model.ckpt")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1308, in save
    self.export_meta_graph(meta_graph_filename)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1331, in export_meta_graph
    graph_def=ops.get_default_graph().as_graph_def(add_shapes=True),
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2268, in as_graph_def
    result, _ = self._as_graph_def(from_version, add_shapes)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2231, in _as_graph_def
    raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.

The example code is as follows:

import tensorflow as tf
import numpy as np

v1 = tf.get_variable("L_enc", [400000, 1024])
v2 = tf.get_variable("L_dec", [400000, 1024])

init_op = tf.initialize_all_variables()

saver = tf.train.Saver(tf.all_variables())

with tf.Session() as sess:
  sess.run(init_op)
  for v in tf.trainable_variables():
    embedding = np.random.uniform(-1, 1, (400000, 1024))
    sess.run(v.assign(embedding))
  # Save the variables to disk.
  save_path = saver.save(sess, "./model.ckpt")
  print("Model saved in file: %s" % save_path)

Answer 1:

Fabrizio correctly points out that there's a hard 2GB limit on the size of protocol buffers, but you might be wondering why your program hits that limit. The problem stems from these lines:

for v in tf.trainable_variables():
  embedding = np.random.uniform(-1, 1, (400000, 1024))
  sess.run(v.assign(embedding))

When execution reaches v.assign(embedding), new nodes are added to the TensorFlow graph. In particular, each embedding array is converted to a tf.constant() tensor that is stored in the graph, and each of those constants is large: roughly 400000 × 1024 × 4 bytes ≈ 1.6GB as float32, so the two of them push the serialized GraphDef past the 2GB limit.
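
If you want to see this for yourself, you can list the operations in the default graph after running the assignment loop. The snippet below is just an illustrative check (the exact op names depend on your TensorFlow version):

graph = tf.get_default_graph()
# v.assign(embedding) adds an Assign op plus a Const op holding the full
# 400000 x 1024 array for every variable in the loop.
const_ops = [op for op in graph.get_operations() if op.type == "Const"]
print("Number of Const ops: %d" % len(const_ops))
print([op.name for op in const_ops])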

The best way to avoid this is to load the variables from the previous model directly into your new model using a tf.train.Saver. Since the two models may have different structures, you might need to specify a mapping from the names of the variables in the old model's checkpoint to the tf.Variable objects in your new model.
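
A minimal sketch of that approach, assuming the pretrained checkpoint is at "./pretrained_model.ckpt" and stored its variables under the hypothetical names "old_model/L_enc" and "old_model/L_dec" (substitute the names your old model actually used):

import tensorflow as tf

# Variables of the new model.
v1 = tf.get_variable("L_enc", [400000, 1024])
v2 = tf.get_variable("L_dec", [400000, 1024])

# Map names as stored in the old checkpoint to the new tf.Variable objects.
# The names on the left are placeholders; inspect your checkpoint to find
# the real ones.
restore_saver = tf.train.Saver({
    "old_model/L_enc": v1,
    "old_model/L_dec": v2,
})

with tf.Session() as sess:
  restore_saver.restore(sess, "./pretrained_model.ckpt")
  # The weights are loaded directly into the variables' buffers, so no
  # large constants are ever added to the graph.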


An alternative way to solve your problem would be to pre-create a tf.placeholder() op for assigning a value to each variable. This might require more restructuring of your actual code, but the following worked for me:

import tensorflow as tf
import numpy as np

v1 = tf.get_variable("L_enc", [400000, 1024])
v2 = tf.get_variable("L_dec", [400000, 1024])

# Define a separate placeholder and assign op for each variable, so
# that we can feed the initial value without adding it to the graph.
vars = [v1, v2]
placeholders = [tf.placeholder(tf.float32, shape=[400000, 1024]) for v in vars]
assign_ops = [v.assign(p) for (v, p) in zip(vars, placeholders)]

init_op = tf.global_variables_initializer()

saver = tf.train.Saver(tf.global_variables())

with tf.Session() as sess:
  sess.run(init_op)
  for p, assign_op in zip(placeholders, assign_ops):
    embedding = np.random.uniform(-1, 1, (400000, 1024))
    sess.run(assign_op, {p: embedding})

  # Save the variables to disk.
  save_path = saver.save(sess, "./model.ckpt")
  print("Model saved in file: %s" % save_path)


Answer 2:

There is a hard 2GB limit on any serialized protocol buffer (and therefore on the GraphDef as a whole) because protobuf uses a 32-bit signed size internally.

https://github.com/tensorflow/tensorflow/issues/4291
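
For the shapes in the question, a quick back-of-the-envelope calculation (mine, not from the linked issue) shows why the serialized graph blows past that limit:

# Each assigned embedding ends up in the graph as a float32 constant.
bytes_per_constant = 400000 * 1024 * 4      # ~1.6 GB per constant
total_bytes = 2 * bytes_per_constant        # ~3.3 GB for both embeddings
print(bytes_per_constant, total_bytes, 2**31 - 1)
# 1638400000 3276800000 2147483647 -> the GraphDef alone exceeds the limit.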