python - How to shuffle training data for an autoencoder in Keras

Question

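I am using an autoencoder in Keras. I want to shuffle the training data x_train so that the autoencoder reconstructs each input as a different sample from the same class. Is this possible?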

model_train = autoencoder.fit(x_train, x_train,
          batch_size=32,
          epochs=1000,
          shuffle=True,
          callbacks=[checkpoint, early_stopping],
          validation_data=(x_test, x_test))

I assume that shuffle=True shuffles x_train and computes the loss on the same (input, target) pairs, which is not what I want.

score 1 · Accepted Answer

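It is possible, but Keras won't do it for you, because it shuffles the data and the labels together. Assuming you have labels, I find this function pretty useful for your purpose: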

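import numpy as np

def create_pairs(data, labels):
    """Split data into (input, target) pairs drawn from the same class."""
    # Empty accumulator; np.empty takes the shape as a single tuple.
    # The leading batch dimension of data is excluded.
    pairs = np.empty((0, 2, *data.shape[1:]), dtype=data.dtype)

    for label in np.unique(labels):
        idxs = np.where(labels == label)[0]
        # The number of indexes must be even in order to create pairs
        idxs = idxs if len(idxs) % 2 == 0 else idxs[:-1]
        np.random.shuffle(idxs)

        # Consecutive shuffled samples form one pair
        samples = data[idxs].reshape((-1, 2, *data.shape[1:]))
        pairs = np.vstack((pairs, samples))
    return pairs[:, 0], pairs[:, 1]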
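
To see what the reshape step does, here is a tiny sketch for a single class. The toy array and the seeded generator are made up for illustration. With 5 samples, one is dropped to get an even count, and the remaining 4 become 2 pairs, so each sample is used once, either as an input or as a target.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable

# Toy data (made up): 5 samples of one class, each of shape (3,).
data = np.arange(15).reshape(5, 3)
idxs = np.arange(len(data))

# Drop the last index so an even number remains, then shuffle in place.
idxs = idxs if len(idxs) % 2 == 0 else idxs[:-1]
rng.shuffle(idxs)

# Consecutive shuffled samples become one (input, target) pair.
pairs = data[idxs].reshape((-1, 2, *data.shape[1:]))
inputs, targets = pairs[:, 0], pairs[:, 1]

print(pairs.shape)   # (2, 2, 3)
print(inputs.shape)  # (2, 3)
```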

Now that the data is shuffled and split into pairs, you can train your model:

x_train, y_train = create_pairs(data, labels)
history = model.fit(
    x_train, y_train,
    batch_size=32,
    epochs=1000,
    shuffle=True,
    callbacks=[checkpoint, early_stopping],
    validation_split=0.2)
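
One caveat with validation_split: Keras takes the validation samples from the end of the arrays, before any shuffling, and create_pairs stacks its output class by class. The last 20% may therefore come from only the last class or classes. A minimal sketch, assuming you want the classes mixed across both splits, is to apply one shared permutation to inputs and targets before calling fit. The helper name shuffle_pairs and the toy arrays are hypothetical, not part of Keras.

```python
import numpy as np

def shuffle_pairs(x, y, seed=None):
    """Apply one random permutation to inputs and targets alike.

    Hypothetical helper; not part of Keras or of the answer above.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))
    # The same index array is used for both, so pairs stay aligned.
    return x[perm], y[perm]

# Toy stand-in for the output of create_pairs: rows grouped by class.
x_pairs = np.array([[0.0], [1.0], [10.0], [11.0], [20.0]])
y_pairs = x_pairs + 0.5  # made-up targets, only to check alignment

x_mixed, y_mixed = shuffle_pairs(x_pairs, y_pairs, seed=0)
```

You would then pass x_mixed and y_mixed to model.fit in place of x_train and y_train.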
