python - How do I use the windows created by the Dataset.window() method in TensorFlow 2.0?

Question

I'm trying to create a dataset with TensorFlow 2.0 that returns random windows from a time series, along with the next value as the target.

I'm using Dataset.window(), which looks promising:

import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices(tf.range(10))
dataset = dataset.window(5, shift=1, drop_remainder=True)
for window in dataset:
    print([elem.numpy() for elem in window])

Output:

[0, 1, 2, 3, 4]
[1, 2, 3, 4, 5]
[2, 3, 4, 5, 6]
[3, 4, 5, 6, 7]
[4, 5, 6, 7, 8]
[5, 6, 7, 8, 9]

However, I want to use the last value of each window as the target. If each window were a tensor, I would use:

dataset = dataset.map(lambda window: (window[:-1], window[-1:]))

But when I try this, I get an exception:

TypeError: '_VariantDataset' object is not subscriptable

score 31 · Accepted Answer

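Dataset.window() returns a dataset of windows in which each window is itself a dataset, not a tensor, which is why it can't be sliced. The solution is to call flat_map() like this, batching each window into a single tensor: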

dataset = dataset.flat_map(lambda window: window.batch(5))
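
To see why this works, here is a minimal sketch (assuming eager execution, the TensorFlow 2.x default) that inspects one element before and after the flat_map() call. The variable names windows, flat, first_window and first_tensor are only for illustration:

```python
import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices(tf.range(10))
windows = dataset.window(5, shift=1, drop_remainder=True)

# Before flat_map(): each element is a small nested Dataset, not a tensor,
# so slicing it with window[:-1] fails.
first_window = next(iter(windows))
print(isinstance(first_window, tf.data.Dataset))

# window.batch(5) packs the 5 elements of one window into a single tensor,
# and flat_map() flattens the nested datasets into one dataset of tensors.
flat = windows.flat_map(lambda window: window.batch(5))
first_tensor = next(iter(flat))
print(first_tensor.shape, first_tensor.numpy())
```

The batch size passed to batch() should match the window size, so that each window ends up in exactly one tensor.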

Now each item in the dataset is a tensor holding one window, so you can split it like this:

dataset = dataset.map(lambda window: (window[:-1], window[-1:]))

So the full code is:

import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices(tf.range(10))
dataset = dataset.window(5, shift=1, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(5))
dataset = dataset.map(lambda window: (window[:-1], window[-1:]))

for X, y in dataset:
    print("Input:", X.numpy(), "Target:", y.numpy())

Which outputs:

Input: [0 1 2 3] Target: [4]
Input: [1 2 3 4] Target: [5]
Input: [2 3 4 5] Target: [6]
Input: [3 4 5 6] Target: [7]
Input: [4 5 6 7] Target: [8]
Input: [5 6 7 8] Target: [9]
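
Since the question asks for random windows, one possible extension is to shuffle the windows and group them into batches. The sketch below is not part of the original answer: the helper name make_window_dataset and its parameters (window_size, batch_size, shuffle_buffer, seed) are hypothetical, and it assumes the series fits in memory as a single tensor:

```python
import tensorflow as tf

def make_window_dataset(series, window_size, batch_size=2,
                        shuffle_buffer=100, seed=None):
    """Hypothetical helper: shuffled (inputs, target) windows from a 1-D series."""
    ds = tf.data.Dataset.from_tensor_slices(series)
    ds = ds.window(window_size, shift=1, drop_remainder=True)
    # Turn each nested window dataset into one tensor of length window_size.
    ds = ds.flat_map(lambda window: window.batch(window_size))
    # Shuffle whole windows so they come out in random order.
    ds = ds.shuffle(shuffle_buffer, seed=seed)
    # The last value of each window is the target, the rest are the inputs.
    ds = ds.map(lambda window: (window[:-1], window[-1:]))
    return ds.batch(batch_size)

dataset = make_window_dataset(tf.range(10), window_size=5)
for X, y in dataset:
    print("Inputs:", X.numpy().tolist(), "Targets:", y.numpy().tolist())
```

Shuffling happens after flat_map() so that the order inside each window is preserved and only the order of the windows changes. The shuffle buffer should be at least as large as the number of windows for a uniform shuffle.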
