
Any idea why our training loss is smooth while our validation loss is noisy across epochs (see link)? We are implementing a deep learning model for diabetic retinopathy detection (binary classification) using the fundus photograph dataset provided by this Kaggle competition. We are using Keras 2.0 with a TensorFlow backend.

Since the dataset is too large to fit into memory, we use fit_generator with ImageDataGenerator to fetch images randomly from the training and validation folders:

# TRAIN THE MODEL
model.fit_generator(
    train_generator,
    steps_per_epoch=train_generator.samples // training_batch_size,
    epochs=int(config['training']['epochs']),
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // validation_batch_size,
    class_weight=None)
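
For reference, the generators are set up roughly like this (a simplified sketch, not our exact code; the paths, image size, batch sizes and the way the mean/std statistics are supplied are placeholders, the real implementation is on GitHub):

from keras.preprocessing.image import ImageDataGenerator

# Sketch only: train-time generator with augmentation, validation generator
# without augmentation. Paths, image size and batch sizes are placeholders.
train_datagen = ImageDataGenerator(
    featurewise_center=True,             # subtract training-set mean
    featurewise_std_normalization=True,  # divide by training-set std
    horizontal_flip=True,
    vertical_flip=True)

validation_datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True)

# The featurewise statistics have to be fitted on (a sample of) the training
# images, and the validation generator should reuse the same statistics, e.g.:
# train_datagen.fit(training_sample)
# validation_datagen.mean, validation_datagen.std = train_datagen.mean, train_datagen.std

train_generator = train_datagen.flow_from_directory(
    'data/train',                        # placeholder path
    target_size=(224, 224),
    batch_size=training_batch_size,
    class_mode='binary',
    shuffle=True)

validation_generator = validation_datagen.flow_from_directory(
    'data/validation',                   # placeholder path
    target_size=(224, 224),
    batch_size=validation_batch_size,
    class_mode='binary',
    shuffle=False)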

Our CNN architecture is VGG16 with dropout = 0.5 on the last two fully connected layers, batch normalization only before the first fully connected layer, and data augmentation (including horizontal and vertical flips of the images). Our training and validation samples are normalized using the training set mean and standard deviation. The batch size is 32. Our activation is a sigmoid and the loss function is binary_crossentropy. You can find our implementation on GitHub.
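
Schematically, the head of the network looks like this (a simplified sketch; the number of units in the fully connected layers and the optimizer are placeholders, the exact code is in the GitHub repository):

from keras.applications.vgg16 import VGG16
from keras.layers import Flatten, Dense, Dropout, BatchNormalization
from keras.models import Model

# Sketch of the architecture described above (placeholder layer sizes).
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

x = Flatten()(base.output)
x = BatchNormalization()(x)                      # batch norm only before the first FC layer
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)                              # dropout on the last two FC layers
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(1, activation='sigmoid')(x)  # binary output

model = Model(inputs=base.input, outputs=predictions)
model.compile(optimizer='adam',                  # placeholder optimizer
              loss='binary_crossentropy',
              metrics=['accuracy'])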

It definitely has nothing to do with overfitting, since we tried a highly regularized model and the behavior was exactly the same. Could it be related to how the validation set is sampled? Have any of you run into a similar problem before?

Thanks!!


1 Answer


I would check the following, in this order:

  • a bug in the validation_generator implementation (including the steps argument – does it actually go through all of the pictures reserved for validation?)
  • in the validation_generator, do not use augmentation (reason: an augmentation might be a bad, non-learnable one, and during training the model can achieve a good score only by hard-coding relationships that do not generalize)
  • change the train/validation split to 50/50
  • compute the validation loss at the end of each epoch via a custom callback (use the same loss function; calling it from a callback can produce different – more accurate, for certain non-standard models – results); a minimal sketch is shown right after this list
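
A minimal sketch of such a callback, assuming a validation_generator and a fixed validation_steps are available (the class and variable names here are placeholders):

from keras.callbacks import Callback

class EpochEndValidation(Callback):
    """Run the whole validation set once at the end of every epoch."""

    def __init__(self, validation_generator, validation_steps):
        super(EpochEndValidation, self).__init__()
        self.validation_generator = validation_generator
        self.validation_steps = validation_steps

    def on_epoch_end(self, epoch, logs=None):
        # evaluate_generator returns [loss, metrics...] for a compiled model
        results = self.model.evaluate_generator(self.validation_generator,
                                                steps=self.validation_steps)
        print('epoch %d - callback validation results: %s' % (epoch, results))

# usage (placeholder names):
# model.fit_generator(train_generator,
#                     steps_per_epoch=steps_per_epoch,
#                     epochs=epochs,
#                     callbacks=[EpochEndValidation(validation_generator,
#                                                   validation_steps)])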

If none of the above gives a smoother validation loss curve, then my next assumption would be that this is simply the way it is, and I might need to work on the model architecture.

answered 2018-10-04T10:10:21.267