I am trying to write a simple script for parameter estimation (the parameters here are the weights). I run into a problem when .grad() returns None. I have also gone through this and this link and understand the concept both in theory and in practice. To me, the following script should work, but unfortunately it does not.
My first attempt: the following script is my first try
gamma = torch.tensor(2.0, device=device, dtype=torch.float, requires_grad=True)
alpha_xy = torch.tensor(3.7, device=device, dtype=torch.float, requires_grad=True)
beta_y = torch.tensor(1.5, device=device, dtype=torch.float, requires_grad=True)
alpha0 = torch.tensor(1.1, device=device, dtype=torch.float, requires_grad=True)
alpha_y = torch.tensor(0.9, device=device, dtype=torch.float, requires_grad=True)
alpha1 = torch.tensor(0.1, device=device, dtype=torch.float, requires_grad=True)
alpha2 = torch.tensor(0.9, device=device, dtype=torch.float, requires_grad=True)
alpha3 = torch.tensor(0.001, device=device, dtype=torch.float, requires_grad=True)
learning_rate = 1e-4
total_loss = []
for epoch in tqdm(range(500)):
    loss_1 = 0
    for j in range(x_train.size(0)):
        input = x_train[j:j+1]
        target = y_train[j:j+1]
        input = input.to(device, non_blocking=True)
        target = target.to(device, non_blocking=True)
        x_dt = gamma*input[0][0] + \
               alpha_xy*input[0][0]*input[0][2] + \
               alpha1*input[0][0]
        y0_dt = beta_y*input[0][0] + \
                alpha2*input[0][1]
        y_dt = alpha0*input[0][1] + \
               alpha_y*input[0][2] + \
               alpha3*input[0][0]*input[0][2]
        pred = torch.tensor([[x_dt],
                             [y0_dt],
                             [y_dt]], device=device)
        loss = (pred - target).pow(2).sum()
        loss_1 += loss
        loss.backward()
        print(pred.grad, x_dt.grad, gamma.grad)
The code above throws the error message

    element 0 of tensors does not require grad and does not have a grad_fn

at the line loss.backward().
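To reproduce the error outside my training loop, here is a minimal standalone snippet (dummy values, not my actual data). Wrapping an already-computed tensor in torch.tensor(...) copies its value into a fresh leaf tensor with requires_grad=False and no grad_fn, which is exactly what backward() then complains about:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
y = w * 3.0                # y has a grad_fn; gradients can flow back to w

p = torch.tensor([y])      # fresh leaf tensor: the graph back to w is severed
print(p.requires_grad, p.grad_fn)  # False None

try:
    p.sum().backward()
except RuntimeError as e:
    print(e)  # element 0 of tensors does not require grad and does not have a grad_fn
```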
My attempt 2: an improvement over the first attempt is as follows:
gamma = torch.tensor(2.0, device=device, dtype=torch.float, requires_grad=True)
alpha_xy = torch.tensor(3.7, device=device, dtype=torch.float, requires_grad=True)
beta_y = torch.tensor(1.5, device=device, dtype=torch.float, requires_grad=True)
alpha0 = torch.tensor(1.1, device=device, dtype=torch.float, requires_grad=True)
alpha_y = torch.tensor(0.9, device=device, dtype=torch.float, requires_grad=True)
alpha1 = torch.tensor(0.1, device=device, dtype=torch.float, requires_grad=True)
alpha2 = torch.tensor(0.9, device=device, dtype=torch.float, requires_grad=True)
alpha3 = torch.tensor(0.001, device=device, dtype=torch.float, requires_grad=True)
learning_rate = 1e-4
total_loss = []
for epoch in tqdm(range(500)):
    loss_1 = 0
    for j in range(x_train.size(0)):
        input = x_train[j:j+1]
        target = y_train[j:j+1]
        input = input.to(device, non_blocking=True)
        target = target.to(device, non_blocking=True)
        x_dt = gamma*input[0][0] + \
               alpha_xy*input[0][0]*input[0][2] + \
               alpha1*input[0][0]
        y0_dt = beta_y*input[0][0] + \
                alpha2*input[0][1]
        y_dt = alpha0*input[0][1] + \
               alpha_y*input[0][2] + \
               alpha3*input[0][0]*input[0][2]
        pred = torch.tensor([[x_dt],
                             [y0_dt],
                             [y_dt]], device=device,
                            dtype=torch.float,
                            requires_grad=True)
        loss = (pred - target).pow(2).sum()
        loss_1 += loss
        loss.backward()
        print(pred.grad, x_dt.grad, gamma.grad)
        # with torch.no_grad():
        #     gamma -= learning_rate * gamma.grad
Now the script runs, but apart from pred.grad the other two return None.

I want to update all the parameters after computing loss.backward(), but that is not happening because of the None gradients. Can anyone suggest how to improve this script? Thank you.
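Based on the links above, here is a minimal sketch of what I expect should keep the gradients flowing (dummy data standing in for x_train/y_train, device and the epoch loop omitted): build pred with torch.stack instead of torch.tensor, and call retain_grad() on non-leaf tensors like x_dt if their .grad is needed:

```python
import torch

# Dummy stand-ins for x_train / y_train, just to make the sketch self-contained.
torch.manual_seed(0)
x_train = torch.randn(4, 3)
y_train = torch.randn(4, 3, 1)

gamma    = torch.tensor(2.0,   requires_grad=True)
alpha_xy = torch.tensor(3.7,   requires_grad=True)
beta_y   = torch.tensor(1.5,   requires_grad=True)
alpha0   = torch.tensor(1.1,   requires_grad=True)
alpha_y  = torch.tensor(0.9,   requires_grad=True)
alpha1   = torch.tensor(0.1,   requires_grad=True)
alpha2   = torch.tensor(0.9,   requires_grad=True)
alpha3   = torch.tensor(0.001, requires_grad=True)

inp, target = x_train[0], y_train[0]
x_dt  = gamma*inp[0] + alpha_xy*inp[0]*inp[2] + alpha1*inp[0]
y0_dt = beta_y*inp[0] + alpha2*inp[1]
y_dt  = alpha0*inp[1] + alpha_y*inp[2] + alpha3*inp[0]*inp[2]

x_dt.retain_grad()  # non-leaf tensors need retain_grad() for .grad to be kept

# torch.stack assembles pred from the computed tensors without cutting the
# graph, unlike torch.tensor(...), which copies the values into a fresh leaf.
pred = torch.stack([x_dt, y0_dt, y_dt]).reshape(3, 1)
loss = (pred - target).pow(2).sum()
loss.backward()

print(gamma.grad, x_dt.grad)  # both are tensors now instead of None
```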