tensorflow - Cannot stack LSTM with MultiRNNCell and dynamic_rnn

Question

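I am trying to build a multivariate time series prediction model. I followed this tutorial for temperature prediction: http://nbviewer.jupyter.org/github/addfor/tutorials/blob/master/machine_learning/ml16v04_forecasting_with_LSTM.ipynb

I want to extend its model to a multilayer LSTM model using the following code: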

cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)  
cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers,state_is_tuple=True)  
output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)

But I get an error saying:

ValueError: Dimensions must be equal, but are 256 and 142 for 'rnn/while/rnn/multi_rnn_cell/cell_0/cell_0/lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [?,256], [142,512].

When I tried this instead:

cell = []
for i in range(num_layers):
    cell.append(tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True))
cell = tf.contrib.rnn.MultiRNNCell(cell,state_is_tuple=True)
output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)

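I don't get that error, but the predictions are really bad.

I define hidden=128.

features = tf.reshape(features, [-1, n_steps, n_input]) has shape (?,1,14) in the single-layer case.

My data looks like this: x.shape=(594,14), y.shape=(591,1)

I am confused about how to stack LSTM cells in TensorFlow. My TensorFlow version is 0.14.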

score 14 · Accepted Answer

This is a very interesting question. Initially, I thought the two snippets produced the same result (i.e. stacking two LSTM cells).

Code 1

cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)  
cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers,state_is_tuple=True)
print(cell)

Code 2

cell = []
for i in range(num_layers):
    cell.append(tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True))
cell = tf.contrib.rnn.MultiRNNCell(cell,state_is_tuple=True)
print(cell)

However, if you print the cell in both cases, you get something like the following:

Code 1

[<tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>, <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>]

Code 2

[<tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>, <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D708B00>]

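If you look closely at the results,

For code 1, the printed list holds two references to the same LSTM cell object (both entries show the same memory address), not two independent cells.
For code 2, the printed list holds two different LSTM cell objects (the two memory addresses differ).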
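The difference comes from how Python builds the two lists, not from TensorFlow itself. Here is a minimal sketch with a stand-in class (`DummyCell` is hypothetical and only takes the place of `LSTMCell` for illustration):

```python
class DummyCell:
    """Hypothetical stand-in for an LSTM cell; it holds no weights."""
    def __init__(self, hidden):
        self.hidden = hidden

num_layers = 2

# Code 1 pattern: list multiplication repeats a reference to ONE object.
shared = [DummyCell(128)] * num_layers

# Code 2 pattern: the constructor runs once per layer, giving distinct objects.
separate = [DummyCell(128) for _ in range(num_layers)]

print(shared[0] is shared[1])      # True
print(separate[0] is separate[1])  # False
```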

Stacking two LSTM cells works as follows. Looking at the big picture (the actual TensorFlow operations may differ), it does two things:

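First, it maps the inputs to the hidden units of LSTM cell 1 (in your case, 14 to 128).
Second, it maps the hidden units of LSTM cell 1 to the hidden units of LSTM cell 2 (in your case, 128 to 128).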
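These two mappings imply a different weight shape for each layer. As a rough sketch, assume the common LSTM layout in which the layer input is concatenated with the previous hidden state and the four gates share one fused weight matrix. That layout is an assumption here, although it is consistent with the [142,512] shape in the error message. `lstm_kernel_shape` is a hypothetical helper, not a TensorFlow function.

```python
def lstm_kernel_shape(input_dim, hidden):
    # Rows: layer input concatenated with the previous hidden state.
    # Columns: four gates of size `hidden` each, fused into one matrix.
    return (input_dim + hidden, 4 * hidden)

n_input, hidden = 14, 128

layer1 = lstm_kernel_shape(n_input, hidden)  # input comes from the data
layer2 = lstm_kernel_shape(hidden, hidden)   # input comes from layer 1

print(layer1)  # (142, 512)
print(layer2)  # (256, 512)
```

The 142 and 256 are the same two numbers that appear in the ValueError from the question.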

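Therefore, when you try to perform both of the above operations with the same LSTM cell object, you get an error, because the two operations need weight matrices of different dimensions.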

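However, if the number of hidden units equals the number of input units (in your case, input is 14 and hidden is 14), there is no error even though you are using the same LSTM cell, because the weight matrices then have the same dimensions.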
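Both cases can be reproduced with a toy model. The sketch below assumes a cell fixes its weight shape the first time it is called, which is what the error suggests but is not taken from TensorFlow's source. `ToyCell` and `run_stack` are hypothetical names, and only shapes are tracked, not real weights.

```python
class ToyCell:
    """Hypothetical cell that fixes its weight shape on first use."""
    def __init__(self, hidden):
        self.hidden = hidden
        self.kernel_shape = None  # decided by the first input it sees

    def __call__(self, input_dim):
        rows = input_dim + self.hidden
        if self.kernel_shape is None:
            self.kernel_shape = (rows, 4 * self.hidden)
        elif self.kernel_shape[0] != rows:
            raise ValueError(
                "Dimensions must be equal, but are %d and %d"
                % (rows, self.kernel_shape[0]))
        return self.hidden  # output width fed to the next layer


def run_stack(cells, n_input):
    # Feed each layer's output width into the next layer.
    width = n_input
    for cell in cells:
        width = cell(width)
    return width


run_stack([ToyCell(128), ToyCell(128)], 14)  # separate cells: fine
run_stack([ToyCell(14)] * 2, 14)             # shared cell, hidden == input: fine

try:
    run_stack([ToyCell(128)] * 2, 14)        # shared cell, 14 != 128
except ValueError as err:
    print(err)  # Dimensions must be equal, but are 256 and 142
```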

So, if you want to stack two LSTM cells, I think your second approach is the correct one.
