tensorflow - TensorFlow: LSTM dropout implementation, shape problem

Question

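I am using an LSTM model in TensorFlow for a prediction project. The structure I implemented runs, but the result is poor: accuracy on the test set is only 0.5. So I searched for tricks for training LSTM-based models, and the suggestion I found was "add dropout".

However, when I follow other people's tutorials, I get errors.

Here is the original version, which works: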

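# Assumed imports (TensorFlow 1.x); the question does not show them.
import tensorflow as tf
from tensorflow.contrib import rnn

def lstmModel(x, weights, biases):
    # Split (batch_size, time_step, data_size) into a list of time_step
    # tensors of shape (batch_size, data_size), as static_rnn requires.
    x = tf.unstack(x, time_step, 1)

    lstm_cell = tf.nn.rnn_cell.LSTMCell(n_hidden, state_is_tuple=True, forget_bias=1)
    outputs, states = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)

    # outputs is a list with one entry per time step; use the last one.
    return tf.matmul(outputs[-1], weights['out']) + biases['out']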

After changing it to the following, I get this error:

ValueError: Shape (90, ?) must have rank at least 3

def lstmModel(x, weights, biases):
    x = tf.unstack(x, time_step, 1)

    lstm_cell = tf.nn.rnn_cell.LSTMCell(n_hidden, state_is_tuple=True, forget_bias=1)
    lstm_dropout = tf.nn.rnn_cell.DropoutWrapper(lstm_cell, output_keep_prob=0.5)
    lstm_layers = rnn.MultiRNNCell([lstm_dropout]* 3)
    outputs, states = tf.nn.dynamic_rnn(lstm_layers, x, dtype=tf.float32)
    return tf.matmul(outputs[-1], weights['out']) + biases['out']

I am not sure whether the shape of my input data is wrong. Before entering this function, the input x has shape (batch_size, time_step, data_size):

batch_size = 30 
time_step = 4 #read 4 words 
data_size = 80 # total 80 words, each is in np.shape of [1,80]

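So the input x for each batch has shape [30, 4, 80], and the word at input x[0,0,80] is followed by the word at x[0,1,80]. Does this design make sense?

The whole implementation is adapted from another tutorial, and I would also like to know what tf.unstack() actually does.

That is several questions... I have put the code on GitHub, with the "working version" and "failed version" mentioned above. Only the function shown here differs! Please take a look, thanks!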

score 1 · Accepted Answer

Removing tf.unstack from the second example should help.

tf.unstack is used to break a tensor into a list of tensors. In your case, it breaks x of size (batch_size, time_step, data_size) into a list of length time_step containing tensors of size (batch_size, data_size).
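To make the shapes concrete, here is a minimal sketch that imitates tf.unstack along axis 1 with NumPy. The `unstack` helper is a hypothetical stand-in written for illustration, not the TensorFlow function itself; the sizes are the ones from the question.

```python
import numpy as np

batch_size, time_step, data_size = 30, 4, 80  # sizes from the question

# Stand-in for one input batch: shape (batch_size, time_step, data_size).
x = np.arange(batch_size * time_step * data_size, dtype=np.float32)
x = x.reshape(batch_size, time_step, data_size)

def unstack(a, axis=0):
    """Hypothetical NumPy imitation of tf.unstack: split `a` along `axis`
    into a list of arrays, dropping that axis from each piece."""
    return list(np.moveaxis(a, axis, 0))

steps = unstack(x, axis=1)

print(len(steps))      # 4, one list entry per time step
print(steps[0].shape)  # (30, 80), i.e. (batch_size, data_size)
```

Each list entry holds every example in the batch at one time step, so `steps[1]` corresponds to `x[:, 1, :]`.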

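This is needed for tf.nn.static_rnn because it unrolls the RNN during graph creation, so it needs a pre-specified number of steps, which is the length of the list returned by tf.unstack.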

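tf.nn.dynamic_rnn unrolls on each run, so it can handle a variable number of steps. It therefore takes a single tensor where dimension 0 is batch_size, dimension 1 is time_step, and dimension 2 is data_size (or with the first two dimensions swapped if time_major is True).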
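The two layouts can be illustrated with a small NumPy sketch. The arrays below are stand-ins for the input tensor and the variable names are chosen for illustration only; no TensorFlow call is involved.

```python
import numpy as np

batch_size, time_step, data_size = 30, 4, 80  # sizes from the question

# Default layout (time_major=False): (batch_size, time_step, data_size).
x_batch_major = np.arange(batch_size * time_step * data_size, dtype=np.float32)
x_batch_major = x_batch_major.reshape(batch_size, time_step, data_size)

# Time-major layout (time_major=True): swap the first two dimensions.
x_time_major = np.transpose(x_batch_major, (1, 0, 2))

print(x_batch_major.shape)  # (30, 4, 80)
print(x_time_major.shape)   # (4, 30, 80)
```

Both arrays hold the same data: example 5 at time step 2 is `x_batch_major[5, 2]` in one layout and `x_time_major[2, 5]` in the other.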

The error occurs because tf.nn.dynamic_rnn expects a 3D tensor, but each element of the input list you supply is only 2D as a result of tf.unstack.
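A rough sketch of why the rank matters, again with NumPy stand-ins. The `require_rank_at_least_3` helper is hypothetical and only mimics the kind of check involved; it does not reproduce TensorFlow's internals or the exact shape shown in the question's error message.

```python
import numpy as np

def require_rank_at_least_3(a):
    """Hypothetical check: reject inputs with fewer than 3 dimensions."""
    if a.ndim < 3:
        raise ValueError("Shape %s must have rank at least 3" % (a.shape,))
    return a

x = np.zeros((30, 4, 80), dtype=np.float32)  # (batch_size, time_step, data_size)

# The full 3D batch passes the check.
require_rank_at_least_3(x)

# After unstacking along the time axis, each list element is only 2D.
steps = list(np.moveaxis(x, 1, 0))
try:
    require_rank_at_least_3(steps[0])
    message = None
except ValueError as err:
    message = str(err)

print(steps[0].ndim)  # 2
print(message)
```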

tl;dr: use tf.unstack with tf.nn.static_rnn, but never with tf.nn.dynamic_rnn.
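One related detail to check when switching between the two functions: tf.nn.static_rnn returns a list with one output per time step, so `outputs[-1]` is the last step, while tf.nn.dynamic_rnn with the default time_major=False returns a single tensor of shape (batch_size, time_step, n_hidden), where `outputs[-1]` selects the last example in the batch instead. The NumPy sketch below shows the difference; the arrays are stand-ins for RNN outputs, and `n_hidden = 16` is an arbitrary value assumed for illustration.

```python
import numpy as np

batch_size, time_step = 30, 4  # sizes from the question
n_hidden = 16                  # arbitrary value, assumed for illustration

# Stand-in for a dynamic_rnn-style output: one batch-major 3D array.
dynamic_outputs = np.random.rand(batch_size, time_step, n_hidden)

# Stand-in for a static_rnn-style output: a list with one array per time step.
static_outputs = list(np.moveaxis(dynamic_outputs, 1, 0))

last_static = static_outputs[-1]          # last time step, all examples
last_example = dynamic_outputs[-1]        # last example, all time steps
last_dynamic = dynamic_outputs[:, -1, :]  # last time step, all examples

print(last_static.shape)   # (30, 16)
print(last_example.shape)  # (4, 16)
print(last_dynamic.shape)  # (30, 16)
```

So after removing tf.unstack, the final line of the model would likely need to index the time axis explicitly, as in `outputs[:, -1, :]`, to feed a (batch_size, n_hidden) matrix into the matmul.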
