
Context:

I am trying to train an image classifier on the Kaggle cell dataset, aiming for a val_acc of 0.95. I have tried many model architectures and epoch counts, as well as several other hyperparameters, and arrived at a promising set that yields 0.9 val_acc.

Things I have tried:

  • Shuffling the image-label pairs together, so the correct label stays with its image
  • Normalizing the images so every pixel lies between 0 and 1
  • Adding BatchNormalization() and Dropout() to reduce overfitting (the model now underfits)
  • Trying permutations of the hyperparameters
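The first two items above can be sketched in a few lines of NumPy; the shapes here are hypothetical stand-ins for the real dataset:

```python
# Shuffle image-label pairs with a single shared permutation so each
# label stays attached to its image, then scale pixels into [0, 1].
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the real data: 4 RGB images of 120x160, 4 labels.
images = rng.integers(0, 256, size=(4, 120, 160, 3)).astype(np.float32)
labels = np.array([0, 1, 2, 3])

# One shared permutation keeps the pairs aligned.
perm = rng.permutation(len(images))
images, labels = images[perm], labels[perm]

# Min-max normalization: every pixel ends up in [0, 1].
images /= 255.0

print(images.min() >= 0.0 and images.max() <= 1.0)  # True
```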

Question:

The best hyperparameter set I found plateaus at a val_acc of 0.9. I have tried many permutations; is there something I am missing or doing wrong?

Model:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 120, 160, 8)       224       
_________________________________________________________________
batch_normalization (BatchNo (None, 120, 160, 8)       32        
_________________________________________________________________
activation (Activation)      (None, 120, 160, 8)       0         
_________________________________________________________________
dropout (Dropout)            (None, 120, 160, 8)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 60, 80, 8)         584       
_________________________________________________________________
batch_normalization_1 (Batch (None, 60, 80, 8)         32        
_________________________________________________________________
activation_1 (Activation)    (None, 60, 80, 8)         0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 60, 80, 8)         584       
_________________________________________________________________
batch_normalization_2 (Batch (None, 60, 80, 8)         32        
_________________________________________________________________
activation_2 (Activation)    (None, 60, 80, 8)         0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 60, 80, 8)         0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 30, 40, 8)         584       
_________________________________________________________________
batch_normalization_3 (Batch (None, 30, 40, 8)         32        
_________________________________________________________________
activation_3 (Activation)    (None, 30, 40, 8)         0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 30, 40, 8)         584       
_________________________________________________________________
batch_normalization_4 (Batch (None, 30, 40, 8)         32        
_________________________________________________________________
activation_4 (Activation)    (None, 30, 40, 8)         0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 30, 40, 8)         0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 15, 20, 8)         584       
_________________________________________________________________
batch_normalization_5 (Batch (None, 15, 20, 8)         32        
_________________________________________________________________
activation_5 (Activation)    (None, 15, 20, 8)         0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 15, 20, 16)        3216      
_________________________________________________________________
batch_normalization_6 (Batch (None, 15, 20, 16)        64        
_________________________________________________________________
activation_6 (Activation)    (None, 15, 20, 16)        0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 15, 20, 16)        0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 8, 10, 16)         6416      
_________________________________________________________________
batch_normalization_7 (Batch (None, 8, 10, 16)         64        
_________________________________________________________________
activation_7 (Activation)    (None, 8, 10, 16)         0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 8, 10, 16)         6416      
_________________________________________________________________
batch_normalization_8 (Batch (None, 8, 10, 16)         64        
_________________________________________________________________
activation_8 (Activation)    (None, 8, 10, 16)         0         
_________________________________________________________________
dropout_4 (Dropout)          (None, 8, 10, 16)         0         
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 4, 5, 16)          6416      
_________________________________________________________________
batch_normalization_9 (Batch (None, 4, 5, 16)          64        
_________________________________________________________________
activation_9 (Activation)    (None, 4, 5, 16)          0         
_________________________________________________________________
flatten (Flatten)            (None, 320)               0         
_________________________________________________________________
dense (Dense)                (None, 240)               77040     
_________________________________________________________________
batch_normalization_10 (Batc (None, 240)               960       
_________________________________________________________________
dropout_5 (Dropout)          (None, 240)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 162)               39042     
_________________________________________________________________
batch_normalization_11 (Batc (None, 162)               648       
_________________________________________________________________
dropout_6 (Dropout)          (None, 162)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 84)                13692     
_________________________________________________________________
batch_normalization_12 (Batc (None, 84)                336       
_________________________________________________________________
dropout_7 (Dropout)          (None, 84)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 4)                 340       
=================================================================
Total params: 158,114
Trainable params: 156,918
Non-trainable params: 1,196
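As a sanity check, the Param # column of the summary can be reproduced by hand from the standard Keras formulas (conv: kh·kw·c_in·c_out + c_out; BatchNormalization: 4·channels; Dense: n_in·n_out + n_out). The kernel sizes below are inferred from the counts themselves, not stated in the question: the numbers are consistent with 3×3 kernels up to conv2d_5 and 5×5 from conv2d_6 onward.

```python
# Cross-check of the Param # column in the Keras summary above.
# Kernel sizes are an inference from the counts (assumption).

def conv_params(kh, kw, c_in, c_out):
    # weights + one bias per filter
    return kh * kw * c_in * c_out + c_out

def bn_params(channels):
    # gamma, beta, moving mean, moving variance
    return 4 * channels

def dense_params(n_in, n_out):
    return n_in * n_out + n_out

assert conv_params(3, 3, 3, 8) == 224      # conv2d
assert conv_params(3, 3, 8, 8) == 584      # conv2d_1 .. conv2d_5
assert conv_params(5, 5, 8, 16) == 3216    # conv2d_6
assert conv_params(5, 5, 16, 16) == 6416   # conv2d_7 .. conv2d_9
assert bn_params(8) == 32 and bn_params(240) == 960
assert dense_params(320, 240) == 77040     # dense
assert dense_params(240, 162) == 39042     # dense_1
assert dense_params(162, 84) == 13692      # dense_2
assert dense_params(84, 4) == 340          # dense_3

total = (conv_params(3, 3, 3, 8) + 5 * conv_params(3, 3, 8, 8)
         + conv_params(5, 5, 8, 16) + 3 * conv_params(5, 5, 16, 16)
         + 6 * bn_params(8) + 4 * bn_params(16)
         + bn_params(240) + bn_params(162) + bn_params(84)
         + dense_params(320, 240) + dense_params(240, 162)
         + dense_params(162, 84) + dense_params(84, 4))
print(total)  # 158114, matching "Total params: 158,114"
```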

[Figures: visualizations of the activations and of val_acc / val_loss]

Notes:

The hyperparameter optimization was done using talos and can be found here. I edited it and added a few modules.
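For reference, a scan like the one talos performs boils down to evaluating a grid of parameter combinations and keeping the best. The sketch below is a minimal pure-Python stand-in, not the talos API; `evaluate()` is a hypothetical placeholder for a full training run returning val_acc:

```python
# Minimal stand-in for a hyperparameter grid scan (NOT the talos API;
# the names and the scoring function are hypothetical placeholders).
from itertools import product

param_grid = {
    "lr": [0.0002, 0.001, 0.005],
    "dropout": [0.0, 0.25, 0.5],
    "batch_size": [32, 64],
}

def evaluate(params):
    # Placeholder for "train the model, return val_acc".
    # Here: a fake score that simply prefers the largest lr.
    return params["lr"]

best_params, best_score = None, float("-inf")
for values in product(*param_grid.values()):
    params = dict(zip(param_grid.keys(), values))
    score = evaluate(params)
    if score > best_score:
        best_params, best_score = params, score

print(best_params["lr"])  # 0.005 with the placeholder scorer
```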


Edit 1:

The optimizer I used is Nadam with a learning rate of 0.0002. Full notebook

TL;DR:

Trained on the Kaggle cell dataset using the best hyperparameters from test runs covering roughly 200 different hyperparameter sets. val_acc plateaus at 0.9. Why not higher?



1 Answer


As far as I can tell, the learning rate I was using was too low. Increasing it seems to have helped.
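A toy way to see why a too-low learning rate plateaus: on a simple quadratic loss, gradient descent with a tiny step size needs vastly more iterations to reach the same tolerance, so within a fixed training budget it simply stops short. This is a sketch of that effect, not the actual Keras training:

```python
# Gradient descent on f(x) = x^2 (gradient 2x): count the steps
# needed to get within `tol` of the minimum for a given learning rate.

def steps_to_converge(lr, x0=10.0, tol=1e-3, max_steps=100_000):
    x = x0
    for step in range(max_steps):
        if abs(x) < tol:
            return step
        x -= lr * 2 * x
    return max_steps  # did not converge within the budget

slow = steps_to_converge(0.0002)   # analogous to the original lr
fast = steps_to_converge(0.01)     # a larger, still-stable lr
print(slow > 40 * fast)  # True: the low rate needs far more steps
```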

Answered 2019-12-10T05:15:18.337