python - TensorFlow translate.py tutorial has

Question

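See the ValueError at the end of the code block below. It occurs when I run the translate tutorial. Any ideas why this breaks? I am running python3 with CUDA and CuDNN correctly installed, and I was able to validate the TensorFlow installation per the install instructions, so basic CuDNN/CUDA functionality should be working. I am using python3 on Ubuntu 16.04.

Has anyone else who used the translate tutorial recently run into this? Assuming the tutorial works for other people, do you know why I would be hitting this problem?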

`(tensorflow) nathan@nathan1:~/repos/tensorflow/models/tutorials/rnn/translate$ python3 translate.py --data_dir ~/data/tensorflow/translate/

Preparing WMT data in /home/nathan/data/tensorflow/translate/
2017-05-16 22:18:50.664841: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-16 22:18:50.664859: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-16 22:18:50.664864: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-05-16 22:18:50.664868: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-16 22:18:50.664872: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-05-16 22:18:50.665996: E tensorflow/stream_executor/cuda/cuda_driver.cc:405] failed call to cuInit: CUDA_ERROR_UNKNOWN
2017-05-16 22:18:50.666149: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: nathan1
2017-05-16 22:18:50.666157: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: nathan1
2017-05-16 22:18:50.666177: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 375.66.0
2017-05-16 22:18:50.666323: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.66  Mon May  1 15:29:16 PDT 2017
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) 
"""
2017-05-16 22:18:50.666338: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 375.66.0
2017-05-16 22:18:50.666343: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 375.66.0
Creating 3 layers of 1024 units.
Traceback (most recent call last):
  File "translate.py", line 322, in <module>
    tf.app.run()
  File "/home/nathan/.local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "translate.py", line 319, in main
    train()
  File "translate.py", line 178, in train
    model = create_model(sess, False)
  File "translate.py", line 136, in create_model
    dtype=dtype)
  File "/home/nathan/repos/tensorflow/models/tutorials/rnn/translate/seq2seq_model.py", line 179, in __init__
    softmax_loss_function=softmax_loss_function)
  File "/home/nathan/.local/lib/python3.5/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 1201, in model_with_buckets
    decoder_inputs[:bucket[1]])
  File "/home/nathan/repos/tensorflow/models/tutorials/rnn/translate/seq2seq_model.py", line 178, in <lambda>
    lambda x, y: seq2seq_f(x, y, False),
  File "/home/nathan/repos/tensorflow/models/tutorials/rnn/translate/seq2seq_model.py", line 142, in seq2seq_f
    dtype=dtype)
  File "/home/nathan/.local/lib/python3.5/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 855, in embedding_attention_seq2seq
    encoder_cell, encoder_inputs, dtype=dtype)
  File "/home/nathan/.local/lib/python3.5/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn.py", line 197, in static_rnn
    (output, state) = call_cell()
  File "/home/nathan/.local/lib/python3.5/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn.py", line 184, in <lambda>
    call_cell = lambda: cell(input_, state)
  File "/home/nathan/.local/lib/python3.5/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 881, in __call__
    return self._cell(embedded, state)
  File "/home/nathan/.local/lib/python3.5/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 953, in __call__
    cur_inp, new_state = cell(cur_inp, cur_state)
  File "/home/nathan/.local/lib/python3.5/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 146, in __call__
    with _checked_scope(self, scope or "gru_cell", reuse=self._reuse):
  File "/usr/lib/python3.5/contextlib.py", line 59, in __enter__
    return next(self.gen)
  File "/home/nathan/.local/lib/python3.5/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 77, in _checked_scope
    type(cell).__name__))
ValueError: Attempt to reuse RNNCell <tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.GRUCell object at 0x7f0b66e04b70> with a different variable scope than its first use.  First use of cell was with scope 'embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/multi_rnn_cell/cell_0/gru_cell', this attempt is with scope 'embedding_attention_seq2seq/rnn/multi_rnn_cell/cell_0/gru_cell'.  Please create a new instance of the cell if you would like it to use a different set of weights.  If before you were using: MultiRNNCell([GRUCell(...)] * num_layers), change to: MultiRNNCell([GRUCell(...) for _ in range(num_layers)]).  If before you were using the same cell instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances (one for forward, one for reverse).  In May 2017, we will start transitioning this cell's behavior to use existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation, so this error will remain until then.`

score 1 · Accepted Answer

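This problem is caused by an update to TensorFlow itself. Recent versions no longer allow the same RNN cell instance to be reused under a different variable scope, which earlier versions permitted.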

rnn_cell = tf.contrib.rnn.LSTMCell(300)

output, _ = tf.nn.bidirectional_dynamic_rnn(rnn_cell, rnn_cell, data, dtype = tf.float32)

#^^^^allowed before but not now

fw_rnn_cell = tf.contrib.rnn.LSTMCell(300)
bw_rnn_cell = tf.contrib.rnn.LSTMCell(300)
output, _ = tf.nn.bidirectional_dynamic_rnn(fw_rnn_cell, bw_rnn_cell, data, dtype = tf.float32)

#^^^^allowed now

#Another example

rnn_cell = tf.contrib.rnn.LSTMCell(300)
output_layer_1, _ = tf.nn.dynamic_rnn(rnn_cell, data, dtype = tf.float32, scope = "rnn_layer_1")
output_layer_2, _ = tf.nn.dynamic_rnn(rnn_cell, output_layer_1, dtype = tf.float32, scope = "rnn_layer_2")

#^^^^allowed before but not now

rnn_cell_1 = tf.contrib.rnn.LSTMCell(300)
output_layer_1, _ = tf.nn.dynamic_rnn(rnn_cell_1, data, dtype = tf.float32, scope = "rnn_layer_1")
rnn_cell_2 = tf.contrib.rnn.LSTMCell(300)
output_layer_2, _ = tf.nn.dynamic_rnn(rnn_cell_2, output_layer_1, dtype = tf.float32, scope = "rnn_layer_2")

#^^^^allowed now

So, what can you do? You can either:

Switch to a different, newer tutorial
Fix the code yourself
Use an older version of TensorFlow
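
The fix suggested in the error message (replacing `[GRUCell(...)] * num_layers` with a list comprehension) comes down to how Python list repetition works: `[cell] * n` stores the same object reference n times, while a comprehension builds n separate objects. Below is a minimal sketch of that difference. `FakeCell` is a hypothetical stand-in class, not a TensorFlow class, so the snippet runs without TensorFlow installed.

```python
class FakeCell:
    """Hypothetical stand-in for an RNN cell; not part of TensorFlow."""

    def __init__(self, num_units):
        self.num_units = num_units


num_layers = 3

# One instance repeated: every list slot refers to the same object.
shared = [FakeCell(1024)] * num_layers

# A fresh instance per layer: every list slot is a distinct object.
separate = [FakeCell(1024) for _ in range(num_layers)]

# Count how many distinct objects each list actually holds.
shared_ids = {id(c) for c in shared}
separate_ids = {id(c) for c in separate}
print(len(shared_ids), len(separate_ids))
```

In the repeated list all layers share a single cell object, which is the situation the ValueError describes. With the comprehension each layer gets its own cell and therefore its own set of weights.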
