pytorch - Replicating https://www.d2l.ai/chapter_linear-networks/linear-regression-scratch.html in PyTorch

Question

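I am trying to replicate the code from that chapter in PyTorch, but I am running into a problem with autograd. I get the following runtime error.

RuntimeError: Trying to backward through the graph a second time

The code is as follows: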

for epoch in range(num_epochs):
    # Assuming the number of examples can be divided by the batch size, all
    # the examples in the training data set are used once in one epoch
    # iteration. The features and tags of mini-batch examples are given by X
    # and y respectively
    for X, y in data_iter(batch_size, features, labels):
        print(X)
        print(y)
        l = loss(net(X, w, b), y)
        print(l)
        l.backward(retain_graph=True)
        print(w.grad)
        print(b.grad)

        with torch.no_grad():
            w -= w.grad * 1e-5 / batch_size
            b -= b.grad * 1e-5 / batch_size
            w.grad.zero_()
            b.grad.zero_()

Can someone explain how autograd works in Python? It would also be great if someone could recommend a good resource for learning PyTorch.

score 0 · Accepted Answer

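PyTorch differs from TensorFlow in that it builds its computation graph dynamically. To save memory, PyTorch frees the graph's intermediate nodes as soon as they are no longer needed, which in practice means right after a backward() call has run. So if you try to backpropagate gradients through those intermediate nodes two or more times, you will run into this error.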
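As a minimal sketch of the behaviour described above (the tensors w and x and their values are hypothetical, chosen only for illustration), the snippet below calls backward() twice on the same loss. The second call is expected to raise the RuntimeError from the question, while a backward pass over a freshly recomputed loss goes through.

```python
import torch

# Hypothetical toy values, used only for illustration.
w = torch.tensor([1.0, 2.0], requires_grad=True)
x = torch.tensor([3.0, 4.0])

# The multiplication and the power each save tensors for the backward pass.
loss = ((w * x) ** 2).sum()

loss.backward()              # first pass works; the saved tensors are freed afterwards
first_grad = w.grad.clone()  # analytically 2 * w * x**2

try:
    loss.backward()          # second pass over the same, already-freed graph
    second_pass_failed = False
except RuntimeError as err:
    second_pass_failed = True
    print("second backward failed:", err)

# Recomputing the forward pass builds a new graph, so backward() works again.
# w.grad was not zeroed, so the new gradient is added to the old one.
loss = ((w * x) ** 2).sum()
loss.backward()
print(w.grad)
```

Note that gradients accumulate in w.grad across backward() calls, which is why training loops zero them after each update.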

The simple solution is to set retain_graph=True. For example,

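from torch.nn.functional import l1_loss, mse_loss

# Autoencoder, x, and opt (an optimizer over model.parameters()) are assumed
# to be defined elsewhere.
model = Autoencoder()
rec = model(x)
loss_1 = mse_loss(rec, x)
loss_2 = l1_loss(rec, x)

opt.zero_grad()
# Both losses share the graph built by model(x), so keep it for the second pass.
loss_1.backward(retain_graph=True)
loss_2.backward()
opt.step()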
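For the training loop in the question, retain_graph=True should not be necessary as long as the forward pass is recomputed for every mini-batch, because each iteration then builds a fresh graph and calls backward() on it only once. The following is a rough, self-contained sketch under several assumptions: the synthetic data, the definitions of data_iter, net, and loss, and the learning rate of 0.03 are stand-ins written for this example, not the asker's actual code.

```python
import torch

torch.manual_seed(0)

# Assumed synthetic data: y = X @ true_w + true_b + small Gaussian noise.
true_w = torch.tensor([2.0, -3.4])
true_b = 4.2
features = torch.randn(1000, 2)
labels = features @ true_w + true_b + 0.01 * torch.randn(1000)


def data_iter(batch_size, features, labels):
    """Yield shuffled mini-batches (stand-in for the question's data_iter)."""
    indices = torch.randperm(len(features))
    for i in range(0, len(features), batch_size):
        batch = indices[i:i + batch_size]
        yield features[batch], labels[batch]


def net(X, w, b):
    """Linear model (stand-in for the question's net)."""
    return X @ w + b


def loss(y_hat, y):
    """Squared loss summed over the mini-batch (stand-in for the question's loss)."""
    return ((y_hat - y) ** 2 / 2).sum()


w = torch.zeros(2, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr, num_epochs, batch_size = 0.03, 3, 10  # assumed hyperparameters

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y)  # forward pass builds a new graph each time
        l.backward()               # single backward pass, no retain_graph needed
        with torch.no_grad():      # update parameters without tracking the update
            w -= lr * w.grad / batch_size
            b -= lr * b.grad / batch_size
            w.grad.zero_()         # reset, since gradients accumulate otherwise
            b.grad.zero_()
    with torch.no_grad():
        epoch_loss = loss(net(features, w, b), labels) / len(features)
    print(f"epoch {epoch + 1}, mean loss {epoch_loss.item():.6f}")
```

If the error still appears in a loop like this, a likely place to look is any tensor that was computed from w or b outside the loop and is reused inside it, since that tensor's graph is freed by the first backward() call.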
