
I want to use a pretrained model to warm-start another, slightly different model. Simply put, I create the new model and assign the pretrained weights to the variables that share the same names. However, when saving the model, I get the following error:

Traceback (most recent call last):
  File "tf_test.py", line 23, in <module>
    save_path = saver.save(sess, "./model.ckpt")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1308, in save
    self.export_meta_graph(meta_graph_filename)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1331, in export_meta_graph
    graph_def=ops.get_default_graph().as_graph_def(add_shapes=True),
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2268, in as_graph_def
    result, _ = self._as_graph_def(from_version, add_shapes)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2231, in _as_graph_def
    raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.

The example code is as follows:

import tensorflow as tf
import numpy as np

v1 = tf.get_variable("L_enc", [400000, 1024])
v2 = tf.get_variable("L_dec", [400000, 1024])

init_op = tf.initialize_all_variables()

saver = tf.train.Saver(tf.all_variables())

with tf.Session() as sess:
  sess.run(init_op)
  for v in tf.trainable_variables():
    embedding = np.random.uniform(-1, 1, (400000, 1024))
    sess.run(v.assign(embedding))
  # Save the variables to disk.
  save_path = saver.save(sess, "./model.ckpt")
  print("Model saved in file: %s" % save_path)

2 Answers


Fabrizio correctly points out that there is a hard 2GB limit on the size of a protocol buffer, but you might be wondering why your program is hitting that limit. The problem stems from these lines:

for v in tf.trainable_variables():
  embedding = np.random.uniform(-1, 1, (400000, 1024))
  sess.run(v.assign(embedding))

When execution hits v.assign(embedding), new nodes are added to the TensorFlow graph. In particular, each embedding array is converted to a tf.constant() tensor that gets embedded in the graph, and each of these is very large (roughly 1.6GB by my estimate: 400000 × 1024 float32 values).
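If you want to see this happening, one rough diagnostic (my addition, not part of the original answer) is to print the serialized graph size after each assignment; note that as_graph_def() itself raises once the graph crosses the 2GB limit:

for v in tf.trainable_variables():
  embedding = np.random.uniform(-1, 1, (400000, 1024))
  sess.run(v.assign(embedding))
  # Each iteration embeds another large constant in the graph, so this
  # size jumps by roughly 1.6GB per step.
  print(tf.get_default_graph().as_graph_def().ByteSize())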

The best way to avoid this is to load the variables from the previous model directly into your new model using a tf.train.Saver. Since the models may have different structures, you may need to specify a mapping from the names of the variables in the old model to the tf.Variable objects in your new model.
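For example, a minimal sketch of that mapping might look like this (the checkpoint path and the old names "old/L_enc" and "old/L_dec" are placeholders for whatever your pretrained model actually used):

# Restore pretrained weights directly into the new variables, instead of
# feeding them through giant graph constants. The dict maps names in the
# old checkpoint to tf.Variable objects in the new graph.
restore_saver = tf.train.Saver({"old/L_enc": v1, "old/L_dec": v2})

with tf.Session() as sess:
  restore_saver.restore(sess, "./pretrained_model.ckpt")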


Another way to solve your problem is to create a tf.placeholder() op up front for assigning a value to each variable. This may require more restructuring of your actual code, but the following worked for me:

v1 = tf.get_variable("L_enc", [400000, 1024])
v2 = tf.get_variable("L_dec", [400000, 1024])

# Define a separate placeholder and assign op for each variable, so
# that we can feed the initial value without adding it to the graph.
vars = [v1, v2]
placeholders = [tf.placeholder(tf.float32, shape=[400000, 1024]) for v in vars]
assign_ops = [v.assign(p) for (v, p) in zip(vars, placeholders)]

init_op = tf.global_variables_initializer()

saver = tf.train.Saver(tf.all_variables())

with tf.Session() as sess:
  sess.run(init_op)
  for p, assign_op in zip(placeholders, assign_ops):
    embedding = np.random.uniform(-1, 1, (400000, 1024))
    sess.run(assign_op, {p: embedding})

  # Save the variables to disk.
  save_path = saver.save(sess, "./model.ckpt")
  print("Model saved in file: %s" % save_path)
answered 2017-03-20T14:51:18.247

There is a hard 2GB limit on serializing an individual tensor, due to the 32-bit signed size field in protobuf.

https://github.com/tensorflow/tensorflow/issues/4291
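In this particular case the limit is easy to hit. A quick back-of-the-envelope check (my arithmetic, not from the linked issue) shows that the two constants embedded by the assign calls are already past it:

bytes_per_constant = 400000 * 1024 * 4   # float32 embedding, ~1.64e9 bytes
limit = 2 ** 31                          # protobuf's signed 32-bit size limit
print(bytes_per_constant)                # 1638400000, ~1.6GB per constant
print(2 * bytes_per_constant > limit)    # True: two of them exceed 2GB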

answered 2017-02-22T20:35:55.803