python - What does the global step do in learning rate decay?

Question

I am following this tutorial:

https://cloud.google.com/architecture/clv-prediction-with-offline-training-train#introduction

I am rewriting some of its code on Google Colab.

They use the following for learning rate decay:

import tensorflow as tf
import tensorflow_addons as tfa

initial_lr = 0.096505
learning_decay_rate = 0.7

# checkpoint_steps is defined elsewhere (not shown here)
lr_schedule = tf.compat.v1.train.exponential_decay(
    learning_rate = initial_lr,
    global_step = tf.compat.v1.train.get_global_step(),
    decay_steps = checkpoint_steps,
    decay_rate = learning_decay_rate,
    staircase = True)
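
For reference, the TensorFlow documentation describes this schedule as learning_rate * decay_rate ^ (global_step / decay_steps), with the division truncated to an integer when staircase is True. A minimal pure-Python sketch of that formula follows; the helper name exponential_decay and the value checkpoint_steps = 1000 are assumptions for illustration only, since the real value is not shown here.

```python
import math


def exponential_decay(initial_lr, global_step, decay_steps, decay_rate, staircase=True):
    """Pure-Python sketch of the documented exponential decay formula."""
    exponent = global_step / decay_steps
    if staircase:
        # With staircase=True the rate only drops once every decay_steps steps.
        exponent = math.floor(exponent)
    return initial_lr * decay_rate ** exponent


checkpoint_steps = 1000  # hypothetical value, not taken from the tutorial

for step in (0, 999, 1000, 2500):
    lr = exponential_decay(0.096505, step, checkpoint_steps, 0.7)
    print(step, lr)
```

The point of the sketch is that global_step is simply the count of training steps taken so far: it is the only input that changes during training, so without it the schedule has nothing to decay against.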

…which I need in order to rebuild the following model:

estimator = tf.estimator.DNNRegressor(
    feature_columns = dnn_features,
    hidden_units = [128, 64, 32, 16],
    config = tf.estimator.RunConfig(
        save_checkpoints_steps = checkpoint_steps),
    model_dir = model_dir,
    batch_norm = True,
    dropout = 0.843251,
    optimizer = tfa.optimizers.ProximalAdagrad(
        learning_rate = lr_schedule,
        l1_regularization_strength = 0.0026019,
        l2_regularization_strength = 0.0107146))

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

I cannot run the model like this, because I get

ValueError: None values not supported.

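…which is caused by the function get_global_step. When I use something else instead, for example the following, my results are much worse than theirs: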

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(...)

My questions are:

What exactly is global_step?
Is it crucial for the model to perform better?
If I do need it: how can I get it to work like this?
