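I'm using ReduceLROnPlateau as a fit callback to reduce the LR. I'm using patience=10, so by the time the LR reduction is triggered, the model may have drifted far from its best weights.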

Is there a way to go back to the minimum acc_loss and restart training from that point with the new LR?

Does that make sense?

I could do it manually with the EarlyStopping and ModelCheckpoint('best.hdf5', save_best_only=True, monitor='val_loss', mode='min') callbacks, but I don't know whether that makes sense.

2 Answers

Here is a working example that follows @nuric's suggestion:

import logging

from tensorflow.keras.callbacks import ReduceLROnPlateau


class ReduceLRBacktrack(ReduceLROnPlateau):
    def __init__(self, best_path, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Checkpoint file kept up to date by a ModelCheckpoint callback
        self.best_path = best_path

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        current = logs.get(self.monitor)
        if current is None:
            # Nothing to compare against, so skip the backtracking check
            logging.warning('Reduce LR on plateau conditioned on metric `%s` '
                            'which is not available. Available metrics are: %s',
                            self.monitor, ','.join(list(logs.keys())))
        elif not self.monitor_op(current, self.best):  # not a new best
            if not self.in_cooldown():  # and we're not in cooldown
                if self.wait + 1 >= self.patience:  # about to reduce the LR
                    # load the best model so far
                    print("Backtracking to best model before reducing LR")
                    self.model.load_weights(self.best_path)

        super().on_epoch_end(epoch, logs)  # actually reduce the LR

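The ModelCheckpoint callback can be used to keep the best-model dump up to date. Both callbacks should monitor the same metric, so that the file at best_path holds the weights ReduceLRBacktrack considers best. For example, pass the following two callbacks to model fit: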

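from tensorflow.keras.callbacks import ModelCheckpoint

model_checkpoint_path = <path to checkpoint>
# c1 writes the best weights; c2 reloads them just before reducing the LR
c1 = ModelCheckpoint(model_checkpoint_path,
                     save_best_only=True,
                     monitor=...)
c2 = ReduceLRBacktrack(best_path=model_checkpoint_path, monitor=...)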
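
To see when the backtrack would actually fire, here is a small framework-free sketch of the patience bookkeeping. The function name `backtrack_epochs` and the sample `val_loss` values are made up for illustration, and it assumes 'min' mode with min_delta=0. It mirrors the plateau logic only roughly and is not the Keras implementation.

```python
def backtrack_epochs(values, patience, cooldown=0):
    """Return the 0-based epochs at which a backtrack + LR reduction would fire.

    Illustrative only: assumes the monitored value should decrease
    ('min' mode) and that any strict improvement counts (min_delta=0).
    """
    best = float("inf")
    wait = 0
    cooldown_counter = 0
    fired = []
    for epoch, current in enumerate(values):
        if cooldown_counter > 0:
            # Still cooling down after a previous reduction
            cooldown_counter -= 1
            wait = 0
        if current < best:
            # New best: this is the epoch a checkpoint would be saved
            best = current
            wait = 0
        elif cooldown_counter == 0:
            wait += 1
            if wait >= patience:
                # Here the callback would reload the best weights,
                # then reduce the learning rate
                fired.append(epoch)
                cooldown_counter = cooldown
                wait = 0
    return fired


# Hypothetical validation losses, one per epoch
val_loss = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.65, 0.66, 0.67, 0.68]
print(backtrack_epochs(val_loss, patience=3))  # prints [5, 9]
```

In this trace the first reduction fires at epoch 5, at which point the weights from epoch 2 (loss 0.7) would be restored, so three epochs of drift are discarded. With patience=10 as in the question, the gap between the best epoch and the reduction can be correspondingly larger.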
Answered 2019-03-18T19:18:52.510

You can create a custom callback that inherits from ReduceLROnPlateau, something like:

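from tensorflow.keras.callbacks import ReduceLROnPlateau

class CheckpointLR(ReduceLROnPlateau):
    def on_train_begin(self, logs=None):
        super().on_train_begin(logs)
        self.last_weights = self.model.get_weights()  # best weights seen so far

    # override on_epoch_end()
    def on_epoch_end(self, epoch, logs=None):
        current = (logs or {}).get(self.monitor)
        if current is not None:
            if self.monitor_op(current, self.best):
                self.last_weights = self.model.get_weights()  # new best: remember it
            elif not self.in_cooldown() and self.wait + 1 >= self.patience:
                self.model.set_weights(self.last_weights)  # roll back before the LR drop
        super().on_epoch_end(epoch, logs)  # actually reduce LR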
Answered 2018-09-07T18:08:54.240