
Does the code below perform only a single step of gradient descent, or does it run the entire gradient descent algorithm?

opt = tf.keras.optimizers.SGD(learning_rate=self.learning_rate)
train = opt.minimize(self.loss, var_list=[self.W1, self.b1, self.W2, self.b2, self.W3, self.b3])

Gradient descent requires performing many steps. But I am not sure whether opt.minimize(self.loss, var_list=[self.W1, self.b1, self.W2, self.b2, self.W3, self.b3]) performs all of those steps or just a single step of gradient descent. Why do I think it performs all of them? Because after that call my loss is zero.


1 Answer


tf.keras.optimizers.Optimizer.minimize() computes the gradients and applies them. So it is a single step.

In the documentation of this function you can read:

This method simply computes gradient using tf.GradientTape and calls apply_gradients(). If you want to process the gradient before applying then call tf.GradientTape and apply_gradients() explicitly instead of using this function.
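As a minimal sketch of what the documentation describes, here is one explicit step written out by hand, with a processing stage (gradient clipping) inserted between computing and applying the gradients. The variable, loss, and clipping threshold are illustrative, not from the question:

```python
import tensorflow as tf

# One trainable scalar and a simple quadratic loss (illustrative values).
w = tf.Variable(5.0)
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

# Compute the gradient explicitly...
with tf.GradientTape() as tape:
    loss = (w - 3.0) ** 2

grads = tape.gradient(loss, [w])
# ...process it before applying (here: clip its norm to 1.0)...
grads = [tf.clip_by_norm(g, 1.0) for g in grads]
# ...and apply it. This is exactly one gradient-descent step.
opt.apply_gradients(zip(grads, [w]))
```

Without the clipping line, this is equivalent to a single call to opt.minimize(loss, var_list=[w]).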

This can also be seen from the implementation of minimize():

  def minimize(self, loss, var_list, grad_loss=None, name=None, tape=None):
    """Minimize `loss` by updating `var_list`.
    This method simply computes gradient using `tf.GradientTape` and calls
    `apply_gradients()`. If you want to process the gradient before applying
    then call `tf.GradientTape` and `apply_gradients()` explicitly instead
    of using this function.
    Args:
      loss: `Tensor` or callable. If a callable, `loss` should take no arguments
        and return the value to minimize. If a `Tensor`, the `tape` argument
        must be passed.
      var_list: list or tuple of `Variable` objects to update to minimize
        `loss`, or a callable returning the list or tuple of `Variable` objects.
        Use callable when the variable list would otherwise be incomplete before
        `minimize` since the variables are created at the first time `loss` is
        called.
      grad_loss: (Optional). A `Tensor` holding the gradient computed for
        `loss`.
      name: (Optional) str. Name for the returned operation.
      tape: (Optional) `tf.GradientTape`. If `loss` is provided as a `Tensor`,
        the tape that computed the `loss` must be provided.
    Returns:
      An `Operation` that updates the variables in `var_list`. The `iterations`
      will be automatically increased by 1.
    Raises:
      ValueError: If some of the variables are not `Variable` objects.
    """
    grads_and_vars = self._compute_gradients(
        loss, var_list=var_list, grad_loss=grad_loss, tape=tape)
    return self.apply_gradients(grads_and_vars, name=name) 
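Since one call is one step, training to convergence requires an explicit loop around minimize(). A hedged sketch with a single toy variable and a made-up quadratic loss (the learning rate and step count are arbitrary, not from the question):

```python
import tensorflow as tf

# Toy setup: one trainable scalar, loss minimized at w == 3.0.
w = tf.Variable(5.0)
loss = lambda: (w - 3.0) ** 2

opt = tf.keras.optimizers.SGD(learning_rate=0.1)

# Each minimize() call performs exactly ONE gradient-descent step,
# so the loop below is what makes w actually converge toward 3.0.
for _ in range(100):
    opt.minimize(loss, var_list=[w])
```

If the loss reaches zero after a single minimize() call, that is a property of the particular loss and learning rate, not evidence that minimize() ran the whole algorithm.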
answered 2022-01-16T20:47:12.557