pytorch - Understanding pytorch autograd

Question

I am trying to understand how pytorch autograd works. If I have the functions y = 2x and z = y**2, then with ordinary differentiation I get dz/dx at x = 1 as 8 (dz/dx = dz/dy * dy/dx = 2y * 2 = 2(2x) * 2 = 8x). Alternatively, z = (2x)**2 = 4x^2 and dz/dx = 8x, so at x = 1 it is 8.

If I do the same with pytorch autograd, I get 4

import torch

x = torch.ones(1,requires_grad=True)
y = 2*x
z = y**2
x.backward(z)
print(x.grad)

which prints

tensor([4.])

Where am I going wrong?

score 5 · Accepted Answer

You're using Tensor.backward incorrectly. To get the result you asked for, you should use

import torch

x = torch.ones(1,requires_grad=True)
y = 2*x
z = y**2
z.backward()  # <-- fixed
print(x.grad)

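The call to z.backward() invokes the back-propagation algorithm, starting at z and working back to each leaf node in the computation graph. In this case x is the only leaf node. After calling z.backward(), the computation graph is reset (its intermediate buffers are freed by default) and the .grad member of each leaf node is updated with the gradient of z with respect to that leaf node (in this case dz/dx).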
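To make the leaf-node point concrete, here is a minimal sketch that reuses the tensors from the question. The name grad_x is only an illustrative helper, not something from the original post.

```python
import torch

x = torch.ones(1, requires_grad=True)  # leaf tensor created by the user
y = 2 * x                              # intermediate (non-leaf) result
z = y ** 2                             # z = 4 * x**2

# Only x is a leaf; y and z are produced by operations on x.
print(x.is_leaf, y.is_leaf, z.is_leaf)
print(x.grad)                          # nothing accumulated before backward

z.backward()                           # back-propagate from z to the leaves
grad_x = x.grad.clone()                # dz/dx = 8x evaluated at x = 1
print(grad_x)
```

Only the leaf x ends up with a populated .grad here; the intermediate tensors y and z do not keep theirs by default.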

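What's actually happening in your original code? Well, what you've done is apply back-propagation starting at x. With no arguments, x.backward() would simply result in x.grad being set to 1, since dx/dx = 1. The additional argument (gradient) is effectively a scale applied to the resulting gradient. In this case z = 4, so you get x.grad = z * dx/dx = 4 * 1 = 4. If interested, you can read up further on what the gradient argument does.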
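A rough sketch of what the gradient argument does, assuming the same x, y and z as in the question. The names x2, y2, dz_dy, grad_scaled and grad_chain are made up for illustration. The first case reproduces the 4 from the question; the second uses the same argument to apply the chain rule by hand and recovers 8.

```python
import torch

# Case 1: back-propagate from x itself, scaled by an explicit gradient.
x = torch.ones(1, requires_grad=True)
x.backward(gradient=torch.tensor([4.0]))  # 4.0 plays the role of z's value
grad_scaled = x.grad.clone()              # 4 * dx/dx = 4
print(grad_scaled)

# Case 2: start at y and pass dz/dy as the incoming gradient.
x2 = torch.ones(1, requires_grad=True)
y2 = 2 * x2
dz_dy = (2 * y2).detach()                 # dz/dy for z = y**2, as a constant
y2.backward(gradient=dz_dy)               # x2.grad = dz/dy * dy/dx
grad_chain = x2.grad.clone()              # 4 * 2 = 8
print(grad_chain)
```

In other words, the argument is the upstream gradient that gets multiplied into the backward pass, which is why passing z to x.backward() just scales dx/dx by z.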

score 0 · Accepted Answer

If you still have some confusion about autograd in pytorch, refer to this. It is a basic XOR gate representation:

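import torch
import torch.nn.functional as F

# XOR truth table; float dtype so the inputs match weights and bias
inputs = torch.tensor(
                [
                    [0, 0],
                    [0, 1],
                    [1, 0],
                    [1, 1]
                ],
                dtype=torch.float32
            )
outputs = torch.tensor(
                [
                    0,
                    1,
                    1,
                    0
                ],
                dtype=torch.float32
        )
weights = torch.randn(1, 2)
weights.requires_grad = True # set it to True for gradient computation

bias = torch.randn(1, requires_grad=True) # set it to True for gradient computation

preds = F.linear(inputs, weights, bias).squeeze(1) # basic linear model; (4, 1) -> (4,) to match outputs
loss = (outputs - preds).mean() # scalar to differentiate (mean signed error, not a real training loss)
loss.backward()
print(weights.grad) # prints the gradient of loss w.r.t. weights, not the weights themselves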
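As a sanity check on the linear example, the gradients can also be worked out by hand. Since loss = mean(outputs - inputs @ weights.T - bias), the gradient with respect to weights is minus the column-wise mean of inputs, and the gradient with respect to bias is -1. The sketch below compares this with autograd; expected_w_grad and expected_b_grad are illustrative names, and the float inputs are an assumption made so the dtypes match.

```python
import torch
import torch.nn.functional as F

inputs = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
outputs = torch.tensor([0., 1., 1., 0.])
weights = torch.randn(1, 2, requires_grad=True)
bias = torch.randn(1, requires_grad=True)

preds = F.linear(inputs, weights, bias).squeeze(1)  # shape (4,)
loss = (outputs - preds).mean()
loss.backward()

# Hand-derived gradients of the same loss.
expected_w_grad = -inputs.mean(dim=0, keepdim=True)  # shape (1, 2)
expected_b_grad = torch.tensor([-1.0])

print(weights.grad, expected_w_grad)
print(bias.grad, expected_b_grad)
```

Note that the result does not depend on the random initial values of weights and bias, because this particular loss is linear in both.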
