
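I am training on the PTB dataset to predict characters (i.e. a character-level LSTM).
The training batch has dimensions [len(dataset) x vocabulary_size]. Here, vocabulary_size = 27 (26 letters + 1 for the unk token and space or full stop).
This is the code that converts the batch input (arrX) and the labels (arrY) to one-hot vectors.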

import numpy as np

# One-hot encode each character id in train_data
arrX = np.zeros((len(train_data), vocabulary_size), dtype=np.float32)
for i, x in enumerate(train_data):
    arrX[i, x] = 1
# Labels are the inputs shifted by one position (the next character)
arrY = arrX[1:, :]
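
For next-character prediction, the inputs and labels are usually the same one-hot matrix offset by one position, so that row i of the labels is the character following row i of the inputs. A minimal sketch of that alignment, assuming a tiny made-up `train_data` in place of the PTB character ids (the values are hypothetical):

```python
import numpy as np

vocabulary_size = 27
# Hypothetical stand-in for the encoded character ids.
train_data = [2, 0, 26, 5, 5]

# Row k of the identity matrix is the one-hot vector for id k.
one_hot = np.eye(vocabulary_size, dtype=np.float32)[train_data]

arrX = one_hot[:-1]  # every character except the last
arrY = one_hot[1:]   # the character that follows each input
print(arrX.shape, arrY.shape)
```

Dropping the last input row keeps both arrays the same length, so they can be sliced into batches with the same offsets.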

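I create placeholders for the input (X) and the labels (Y) in a Graph so that I can pass them to a tflearn LSTM. Below is the code for the graph and the session.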

batch_size = 256
with tf.Graph().as_default():
    X = tf.placeholder(shape=(None, vocabulary_size), dtype=tf.float32)       
    Y = tf.placeholder(shape=(None, vocabulary_size), dtype=tf.float32)      
    print (utils.get_incoming_shape(tf.concat(0, Y)))
    print (utils.get_incoming_shape(X))
    net = tflearn.lstm(X, 512, return_seq=True)
    print (utils.get_incoming_shape(net))
    net = tflearn.dropout(net, 0.5)
    print (utils.get_incoming_shape(net))
    net = tflearn.lstm(net, 256)
    net = tflearn.fully_connected(net, vocabulary_size, activation='softmax')
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(net, Y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)

init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    offset=0
    avg_cost = 0
    total_batch = (train_length-1) // batch_size
    print ("No. of batches:", '%d' %total_batch)
    for i in range(total_batch) :
        batch_xs, batch_ys = trainX[offset : batch_size + offset], trainY[offset : batch_size + offset]
        sess.run(optimizer, feed_dict={X: batch_xs, Y: batch_ys})
        cost = sess.run(loss, feed_dict={X: batch_xs, Y: batch_ys})
        avg_cost += cost/total_batch
        if i % 20 == 0:
            print("Step:", '%03d' % i, "Loss:", str(cost))
        offset += batch_size    

With this, I get the following error: assert ndim >= 3, "Input dim should be at least 3." AssertionError: Input dim should be at least 3.

How should I resolve this error? Is there an alternative solution? Should I write a separate LSTM definition?


2 Answers


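I'm not used to this kind of dataset, but have you tried using tflearn.input_data(shape) together with a tflearn.embedding layer? If you use an embedding, I suppose you won't have to reshape your data into 3 dimensions.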
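
To illustrate why an embedding would sidestep the manual reshape: an embedding layer is essentially a table lookup, so integer ids of shape [samples, timesteps] come out as [samples, timesteps, embedding_dim], which is already 3-D. Below is a NumPy sketch of that lookup only. It is not tflearn's implementation, and `embedding_dim`, the random matrix, and the ids are all made up for illustration.

```python
import numpy as np

vocabulary_size = 27
embedding_dim = 8  # hypothetical embedding size

rng = np.random.default_rng(0)
# Stand-in for a trainable embedding matrix: one row per character id.
embedding_matrix = rng.normal(size=(vocabulary_size, embedding_dim)).astype(np.float32)

# Integer character ids, shape [samples, timesteps]; no one-hot encoding needed.
ids = np.array([[1, 5, 0, 26],
                [5, 0, 26, 3]])

# Indexing with a 2-D id array returns one embedding row per id.
embedded = embedding_matrix[ids]
print(embedded.shape)  # (2, 4, 8)
```

Note that this route feeds integer ids rather than one-hot vectors, so the input pipeline in the question would need to change accordingly.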

Answered 2017-01-10T14:36:50.203

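The lstm layer accepts its input as a 3-D tensor of shape [samples, timesteps, input dim]. You can reshape your input data to 3-D. In your question, trainX has shape [len(dataset) x vocabulary_size]. Using trainX = trainX.reshape(trainX.shape + (1,)), the shape changes to [len(dataset), vocabulary_size, 1]. This data can then be passed to the lstm with a simple change to the input placeholder X: add one dimension to the placeholder with X = tf.placeholder(shape=(None, vocabulary_size, 1), dtype=tf.float32).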
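
A minimal NumPy sketch of the reshape described above, using a small made-up array in place of the PTB data (`train_data` here is hypothetical). With the trailing axis of size 1, each one-hot vector is read as vocabulary_size time steps of a single feature. The second half shows a different layout, [samples, seq_len, vocabulary_size], built from sliding windows of consecutive characters. That part is my own assumption about what a character-level model might want, not something the answer proposes, and `seq_len` is an arbitrary choice.

```python
import numpy as np

vocabulary_size = 27
# Hypothetical stand-in for the encoded PTB character ids.
train_data = np.array([1, 5, 0, 26, 3, 3, 7, 2])

# One-hot matrix of shape [len(dataset), vocabulary_size].
trainX = np.eye(vocabulary_size, dtype=np.float32)[train_data]

# Reshape from the answer: append a trailing axis of size 1.
reshaped = trainX.reshape(trainX.shape + (1,))
print(reshaped.shape)  # (8, 27, 1)

# Alternative layout (assumption): windows of seq_len consecutive characters,
# each paired with the character that follows the window.
seq_len = 3
num_windows = len(train_data) - seq_len
windows = np.stack([trainX[i:i + seq_len] for i in range(num_windows)])
targets = trainX[seq_len:]
print(windows.shape, targets.shape)  # (5, 3, 27) (5, 27)
```

For the windowed layout the placeholder would be declared with shape (None, seq_len, vocabulary_size) instead of (None, vocabulary_size, 1).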

Answered 2017-04-03T21:18:58.800