
Any idea why our training loss is smooth while our validation loss is noisy across epochs (see link)? We are implementing a deep learning model for diabetic retinopathy detection (binary classification) using the fundus photograph dataset provided by this Kaggle competition. We are using Keras 2.0 with a TensorFlow backend.

Since the dataset is too large to fit into memory, we use fit_generator with ImageDataGenerator to fetch images randomly from the training and validation folders:

# TRAIN THE MODEL
model.fit_generator(
    train_generator,
    steps_per_epoch=train_generator.samples // training_batch_size,
    epochs=int(config['training']['epochs']),
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // validation_batch_size,
    class_weight=None)
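
For reference, the generators are set up roughly like this (a simplified sketch, not our exact code; the paths, image size, batch sizes and the way the mean/std statistics are supplied are placeholders, the real implementation is on GitHub):

from keras.preprocessing.image import ImageDataGenerator

# Sketch only: train-time generator with augmentation, validation generator
# without augmentation. Paths, image size and batch sizes are placeholders.
train_datagen = ImageDataGenerator(
    featurewise_center=True,             # subtract training-set mean
    featurewise_std_normalization=True,  # divide by training-set std
    horizontal_flip=True,
    vertical_flip=True)

validation_datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True)

# The featurewise statistics have to be fitted on (a sample of) the training
# images, and the validation generator should reuse the same statistics, e.g.:
# train_datagen.fit(training_sample)
# validation_datagen.mean, validation_datagen.std = train_datagen.mean, train_datagen.std

train_generator = train_datagen.flow_from_directory(
    'data/train',                        # placeholder path
    target_size=(224, 224),
    batch_size=training_batch_size,
    class_mode='binary',
    shuffle=True)

validation_generator = validation_datagen.flow_from_directory(
    'data/validation',                   # placeholder path
    target_size=(224, 224),
    batch_size=validation_batch_size,
    class_mode='binary',
    shuffle=False)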

Our CNN architecture is VGG16 with dropout = 0.5 on the last two fully connected layers, batch normalization only before the first fully connected layer, and data augmentation (including horizontal and vertical flips of the images). Our training and validation samples are normalized using the training set mean and standard deviation. The batch size is 32. Our activation is a sigmoid and the loss function is binary_crossentropy. You can find our implementation on GitHub.
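
Schematically, the head of the network looks like this (a simplified sketch; the number of units in the fully connected layers and the optimizer are placeholders, the exact code is in the GitHub repository):

from keras.applications.vgg16 import VGG16
from keras.layers import Flatten, Dense, Dropout, BatchNormalization
from keras.models import Model

# Sketch of the architecture described above (placeholder layer sizes).
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

x = Flatten()(base.output)
x = BatchNormalization()(x)                      # batch norm only before the first FC layer
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)                              # dropout on the last two FC layers
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(1, activation='sigmoid')(x)  # binary output

model = Model(inputs=base.input, outputs=predictions)
model.compile(optimizer='adam',                  # placeholder optimizer
              loss='binary_crossentropy',
              metrics=['accuracy'])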

It definitely has nothing to do with overfitting, since we tried a highly regularized model and the behavior was exactly the same. Could it be related to how the validation set is sampled? Have any of you run into a similar problem before?

Thanks!!


1 Answer


I would check the following, in this order:

  • a bug in the validation_generator implementation (including the steps argument – does it actually go through all of the pictures reserved for validation?)
  • in the validation_generator, do not use augmentation (reason: an augmentation might be a bad, non-learnable one, and during training the model can achieve a good score only by hard-coding relationships that do not generalize)
  • change the train/validation split to 50/50
  • compute the validation loss at the end of each epoch via a custom callback (use the same loss function; calling it from a callback can produce different – more accurate, for certain non-standard models – results); a minimal sketch is shown right after this list
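
A minimal sketch of such a callback, assuming a validation_generator and a fixed validation_steps are available (the class and variable names here are placeholders):

from keras.callbacks import Callback

class EpochEndValidation(Callback):
    """Run the whole validation set once at the end of every epoch."""

    def __init__(self, validation_generator, validation_steps):
        super(EpochEndValidation, self).__init__()
        self.validation_generator = validation_generator
        self.validation_steps = validation_steps

    def on_epoch_end(self, epoch, logs=None):
        # evaluate_generator returns [loss, metrics...] for a compiled model
        results = self.model.evaluate_generator(self.validation_generator,
                                                steps=self.validation_steps)
        print('epoch %d - callback validation results: %s' % (epoch, results))

# usage (placeholder names):
# model.fit_generator(train_generator,
#                     steps_per_epoch=steps_per_epoch,
#                     epochs=epochs,
#                     callbacks=[EpochEndValidation(validation_generator,
#                                                   validation_steps)])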

If none of the above gives a smoother validation loss curve, then my next assumption would be that this is simply the way it is, and I might need to work on the model architecture.

answered 2018-10-04T10:10:21.267