As the title clearly describes the situation I am experiencing, my CNN model is still overfitting despite the use of Dropout, MaxPooling, Regularizers, and EarlyStopping. In addition, I have tried various values of learning_rate, dropout_rate, and the L1/L2 regularization weight decay. How can I further prevent overfitting?
Here is the model (Keras, using the TensorFlow backend):
from keras.models import Sequential
from keras.layers import Embedding, Dropout, Conv1D, MaxPooling1D, GlobalMaxPooling1D, Dense
from keras import regularizers
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping

batch_size = 128
num_epochs = 200
weight_decay = 1e-3
num_filters = 32 * 2
n_kernel_size = 5
num_classes = 3
activation_fn = 'relu'
nb_units = 128
last_dense_units = 128
n_lr = 0.001
n_momentum = 0.99
n_dr = 0.00001
dropout_rate = 0.8
model = Sequential()
model.add(Embedding(nb_words, EMBEDDING_DIM, input_length=max_seq_len))
model.add(Dropout(dropout_rate))
model.add(Conv1D(num_filters, n_kernel_size, padding='same', activation=activation_fn,
kernel_regularizer=regularizers.l2(weight_decay)))
model.add(MaxPooling1D())
model.add(GlobalMaxPooling1D())
model.add(Dense(128, activation=activation_fn, kernel_regularizer=regularizers.l2(weight_decay)))
model.add(Dropout(dropout_rate))
model.add(Dense(num_classes, activation='softmax'))
adam = Adam(lr=n_lr, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=n_dr)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['acc'])
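A side note on the optimizer settings above: Keras' legacy `decay` argument shrinks the learning rate every batch update as `lr / (1 + decay * iterations)`. The sketch below shows just that schedule (not the optimizer itself); with `n_dr = 0.00001` the learning rate stays close to its initial value for a very long time, so this knob is unlikely to affect overfitting much.

```python
def effective_lr(lr, decay, iterations):
    # Keras' legacy time-based decay: lr is divided by (1 + decay * t),
    # where t counts batch updates, not epochs.
    return lr / (1.0 + decay * iterations)

# With lr=0.001 and decay=1e-5, it takes 100,000 batch updates
# for the learning rate to fall to half its initial value.
half = effective_lr(0.001, 1e-5, 100000)  # 0.0005
```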
early_stopping = EarlyStopping(
monitor='val_loss',
patience=3,
mode='min',
verbose=1,
restore_best_weights=True
)
model.fit(...)
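To clarify what the EarlyStopping configuration above does: with `monitor='val_loss'`, `patience=3`, and `restore_best_weights=True`, training stops after three consecutive epochs without a new best validation loss, and the weights from the best epoch are kept. A minimal pure-Python sketch of that patience logic (a simplification, not the Keras internals):

```python
def early_stop(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 0-based: the epoch at
    which training halts and the epoch whose weights are restored."""
    best_loss = float('inf')
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:       # improvement: record it, reset counter
            best_loss, best_epoch = loss, epoch
            wait = 0
        else:                      # no improvement this epoch
            wait += 1
            if wait >= patience:   # patience exhausted: stop training
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# val_loss improves for 3 epochs, then plateaus:
stop, best = early_stop([0.9, 0.7, 0.6, 0.65, 0.66, 0.7, 0.68])
# stops at epoch 5, restores the weights from epoch 2
```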