
I am tuning a model with Keras Tuner's BayesianOptimization. The tuning objective is val_loss, which is computed after every epoch. As I understand it, the tuner runs through various hyperparameter configurations, trains a model for each, and keeps track of val_loss as it goes. It saves the model weights from the epoch with the lowest (best) val_loss. After tuning, the tuner method get_best_models returns the model that achieved the best val_loss at any point during its training.
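For reference, after the search finishes I retrieve the result roughly like this (a minimal sketch; tuner is the BayesianOptimization instance shown further down, and the exact calls in my code may differ slightly):

best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]  # hyperparameters of the best trial
best_model = tuner.get_best_models(num_models=1)[0]         # the model the tuner considers best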

However, looking at the tuning log, I noticed that the value reported at the end as the best val_loss is not actually the lowest val_loss reported during tuning. In the log you can see how, after trial 64, the "Best val_loss So Far" increases to 0.431, even though trial 64 itself had a much worse val_loss.

Here is an excerpt from the log (I have omitted the per-epoch log lines in the middle of training, indicated by "..."):

Search: Running Trial #62

Hyperparameter    |Value             |Best Value So Far 
lstm_reg          |0.01              |0                 
lstm_units        |384               |416               
learning_rate     |0.01741           |0.00062759        

Epoch 1/200
58/58 - 8s - loss: 5.8378 - mean_absolute_error: 0.8131 - val_loss: 2.1253 - val_mean_absolute_error: 0.6561
...
Epoch 26/200
58/58 - 5s - loss: 0.4074 - mean_absolute_error: 0.4579 - val_loss: 0.8352 - val_mean_absolute_error: 0.5948
Trial 62 Complete [00h 02m 37s]
val_loss: 0.5230200886726379

Best val_loss So Far: 0.396116703748703
Total elapsed time: 04h 32m 29s

Search: Running Trial #63

Hyperparameter    |Value             |Best Value So Far 
lstm_reg          |0.001             |0                 
lstm_units        |288               |416               
learning_rate     |0.00073415        |0.00062759        

Epoch 1/200
58/58 - 5s - loss: 0.8142 - mean_absolute_error: 0.6041 - val_loss: 0.8935 - val_mean_absolute_error: 0.5796
...
Epoch 45/200
58/58 - 5s - loss: 0.1761 - mean_absolute_error: 0.2561 - val_loss: 0.8256 - val_mean_absolute_error: 0.6804
Trial 63 Complete [00h 04m 04s]
val_loss: 0.527589738368988

Best val_loss So Far: 0.396116703748703
Total elapsed time: 04h 36m 34s

Search: Running Trial #64

Hyperparameter    |Value             |Best Value So Far 
lstm_reg          |0.01              |0                 
lstm_units        |384               |416               
learning_rate     |0.00011261        |0.00062759        

Epoch 1/200
58/58 - 6s - loss: 4.1151 - mean_absolute_error: 0.6866 - val_loss: 3.3185 - val_mean_absolute_error: 0.4880
...
Epoch 94/200
58/58 - 6s - loss: 0.3712 - mean_absolute_error: 0.3964 - val_loss: 0.7933 - val_mean_absolute_error: 0.5781
Trial 64 Complete [00h 09m 06s]
val_loss: 0.6574578285217285

Best val_loss So Far: 0.43126755952835083
Total elapsed time: 04h 45m 40s

Search: Running Trial #65

Hyperparameter    |Value             |Best Value So Far 
lstm_reg          |0.0001            |0                 
lstm_units        |480               |256               
learning_rate     |0.010597          |0.05              

Epoch 1/200
58/58 - 6s - loss: 1.1511 - mean_absolute_error: 0.7090 - val_loss: 1.1972 - val_mean_absolute_error: 0.6724
...

The tuning summary reports the best val_loss as 0.400, even though at some point it must have found a model whose val_loss was actually better, 0.396 (in trial 58, to be exact):

Best val_loss So Far: 0.4001617431640625
Total elapsed time: 15h 06m 02s
Hyperparameter search complete. Optimal parameters: ...

Here is the code that creates the tuner:

import keras_tuner as kt

tuner = kt.BayesianOptimization(
    feedback_model_builder,
    objective="val_loss",
    directory="./model_tuning",
    project_name=name,
    max_trials=200,
)
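For completeness, this is roughly what feedback_model_builder does. It is a simplified sketch reconstructed from the hyperparameter names in the log above (lstm_reg, lstm_units, learning_rate); the real architecture and search ranges differ in details not shown here:

import tensorflow as tf

def feedback_model_builder(hp):
    # Hyperparameter names match the log; the ranges/values here are illustrative.
    reg = hp.Choice("lstm_reg", values=[0.0, 1e-4, 1e-3, 1e-2])
    units = hp.Int("lstm_units", min_value=32, max_value=512, step=32)
    lr = hp.Float("learning_rate", min_value=1e-4, max_value=5e-2, sampling="log")

    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(units, kernel_regularizer=tf.keras.regularizers.l2(reg)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
        loss="mse",
        metrics=["mean_absolute_error"],
    )
    return model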

And here is how the tuning process is started:

tuner.search(
    multi_window.train,
    validation_data=multi_window.val,
    callbacks=[early_stopping],
    verbose=tf_verbosity,
    epochs=200,
)
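The early_stopping callback passed to search above is an EarlyStopping instance along these lines (a sketch; the monitor/patience/restore_best_weights settings shown here are assumptions rather than my exact values):

import tensorflow as tf

early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",          # assumption: monitoring the same metric as the tuning objective
    patience=10,                 # assumption: actual patience not shown in this post
    restore_best_weights=True,   # assumption
)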

Why does the "best" model not correspond to the lowest val_loss that was encountered? Am I misunderstanding how the tuner works, or is this a bug?
