
I am building a neural machine translator, and I have to use two different LSTM cells (one for the encoder, one for the decoder).

The two cells have different shapes:

  • the encoder (the first one) is fed the tokens of the input sentence and produces a state vector
  • the decoder (the second one) is fed the previous state vector, along with the tokens it generates itself

I wrote this in TensorFlow, and when I run the script I get the following error (raised during the decoder phase):

  outputs, states = tf.nn.rnn(cell_backward, inputs, initial_state=initial_state)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 158, in rnn
    (output, state) = call_cell()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 145, in <lambda>
    call_cell = lambda: cell(input_, state)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 520, in __call__
    dtype, self._num_unit_shards)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 357, in _get_concat_variable
    sharded_variable = _get_sharded_variable(name, shape, dtype, num_shards)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 387, in _get_sharded_variable
    dtype=dtype))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 732, in get_variable
    partitioner=partitioner, validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 596, in get_variable
    partitioner=partitioner, validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 161, in get_variable
    caching_device=caching_device, validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 437, in _get_single_variable
    name, "".join(traceback.format_list(tb))))
ValueError: Variable backward/RNN/LSTMCell/W_0 already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "/home/alexis/Documents/NMT/NMT.py", line 88, in dense_to_vector_state
    outputs, states = tf.nn.rnn(cell_forward, inputs, initial_state=initial_state)
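
Stripped down, the two calls involved look roughly like this (a reconstruction from the traceback; the cell construction and the `inputs`/`initial_state` tensors do not appear in the error, so treat those parts as placeholders):

cell_forward = tf.nn.rnn_cell.LSTMCell(num_units)   # encoder cell
cell_backward = tf.nn.rnn_cell.LSTMCell(num_units)  # decoder cell

# first call (NMT.py line 88): creates the RNN/LSTMCell/W_0 weights in the current scope
outputs, states = tf.nn.rnn(cell_forward, inputs, initial_state=initial_state)

# second call: tries to create the same weights again in the same scope -> ValueError
outputs, states = tf.nn.rnn(cell_backward, inputs, initial_state=initial_state)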

How can I explicitly specify that I want to create a completely new LSTM cell?

Thanks in advance!

Alexis


2 Answers


I am also trying to do machine translation; here are my encoder and decoder. You just need to use a different variable scope for each RNN. Instead of using a MultiRNNCell for the encoder, I unroll each layer manually, which lets me alternate the direction between layers. Notice how each layer gets its own scope.

with tf.variable_scope('encoder'):
    rnn_cell = tf.nn.rnn_cell.LSTMCell(512, num_proj = 256, state_is_tuple = True)
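    # a single cell object is reused for every layer; its weights are created per layer under the 'level_%d' scopes below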
    for level in range(3):
        with tf.variable_scope('level_%d' % level) as scope:
            state = [tf.zeros((BATCH_SIZE, sz)) for sz in rnn_cell.state_size]
            for t in range(TIME_STEPS) if level % 2 else reversed(range(TIME_STEPS)):
                y[t], state = rnn_cell(y[t], state)
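                # after the first time step, reuse this layer's weights for the remaining steps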
                scope.reuse_variables()


with tf.variable_scope('decoder') as scope:
    rnn_cell = tf.nn.rnn_cell.MultiRNNCell \
    ([
        tf.nn.rnn_cell.LSTMCell(512, num_proj = 256, state_is_tuple = True),
        tf.nn.rnn_cell.LSTMCell(512, num_proj = WORD_VEC_SIZE, state_is_tuple = True)
    ], state_is_tuple = True)

    state = [[tf.zeros((BATCH_SIZE, sz)) for sz in sz_outer] for sz_outer in rnn_cell.state_size]

    W_soft = tf.get_variable('W_soft', shape = (NWORDS, WORD_VEC_SIZE), initializer = tf.truncated_normal_initializer(0.0, 1 / np.sqrt(WORD_VEC_SIZE)))
    b_soft = tf.get_variable('b_soft', shape = (NWORDS,), initializer = tf.truncated_normal_initializer(0.0, 0.01))
    cost = 0
    output = [None] * TIME_STEPS

    for t in range(TIME_STEPS):
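        # teacher forcing: feed the previous target word while training, otherwise feed the model's own previous output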
        if t:
            last = y_[t - 1] if TRAINING else y[t - 1]
        else:
            last = tf.zeros((BATCH_SIZE, WORD_VEC_SIZE))

        y[t] = tf.concat(1, (y[t], last))
        y[t], state = rnn_cell(y[t], state)

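        # sampled softmax avoids evaluating the full NWORDS-way softmax when computing the training loss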
        cost += tf.reduce_mean(tf.nn.sampled_softmax_loss(W_soft, b_soft, y[t], target_output[:, t : t + 1], 1000, NWORDS))
        output[t] = tf.reshape(tf.nn.softmax(tf.matmul(y[t], W_soft, transpose_b = True) + b_soft), (BATCH_SIZE, 1, NWORDS))

        scope.reuse_variables()

    output = tf.concat(1, output)
    cost /= TIME_STEPS
Answered 2016-07-17T20:58:33.130

Use variable scopes:

with tf.variable_scope('enc'):
  cell_enc = LSTMCell(hidden_size)
with tf.variable_scope('dec'):
  cell_dec = LSTMCell(hidden_size)
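
Applied to the two `tf.nn.rnn` calls from the question, that looks something like the sketch below (the `encoder_inputs`, `decoder_inputs` and `hidden_size` names are placeholders, not from the original code). In this TensorFlow version the LSTM weights are only created the first time a cell is called, so the `tf.nn.rnn` call itself should also run inside the corresponding scope:

with tf.variable_scope('enc'):
  cell_enc = LSTMCell(hidden_size)
  # weights are created here, under 'enc/RNN/LSTMCell/...'
  enc_outputs, enc_state = tf.nn.rnn(cell_enc, encoder_inputs, dtype=tf.float32)

with tf.variable_scope('dec'):
  cell_dec = LSTMCell(hidden_size)
  # a separate set of weights is created here, under 'dec/RNN/LSTMCell/...'
  dec_outputs, dec_state = tf.nn.rnn(cell_dec, decoder_inputs, initial_state=enc_state)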
Answered 2016-07-17T19:02:05.333