python - PyTorch and polynomial linear regression issue

Question

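I modified code that I found on the PyTorch GitHub to fit my data, but my loss values are huge, grow with every iteration, and eventually become nan. The code does not raise any errors; it just produces no usable loss values or predictions. I have other code that handles simple linear regression, and it works fine. I suspect I am missing something simple here, but I cannot see it. Any help would be appreciated.

Code: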

import matplotlib.pyplot as plt
import sklearn.linear_model as lm
from sklearn.preprocessing import PolynomialFeatures
import torch
import torch.autograd
import torch.nn.functional as F
from torch.autograd import Variable


train_data = torch.Tensor([
   [40,  6,  4],
   [44, 10,  4],
   [46, 12,  5],
   [48, 14,  7],
   [52, 16,  9],
   [58, 18, 12],
   [60, 22, 14],
   [68, 24, 20],
   [74, 26, 21],
   [80, 32, 24]])
test_data = torch.Tensor([
    [6, 4],
    [10, 5],
    [4, 8]])

x_train = train_data[:,1:3]
y_train = train_data[:,0]

POLY_DEGREE = 3
input_size = 2
output_size = 1

poly = PolynomialFeatures(input_size * POLY_DEGREE, include_bias=False)
x_train_poly = poly.fit_transform(x_train.numpy())


class Model(torch.nn.Module):

    def __init__(self):
        super(Model, self).__init__()
        self.fc = torch.nn.Linear(poly.n_output_features_, output_size)
                
    def forward(self, x):
        return self.fc(x)
            
model = Model()    
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)

losses = []

for i in range(10):
    optimizer.zero_grad()
    outputs = model(Variable(torch.Tensor(x_train_poly)))
    print(outputs)
    loss = criterion(outputs, Variable(y_train))
    print(loss.data[0])
    losses.append(loss.data[0])
    loss.backward()    
    optimizer.step()
    if loss.data[0] < 1e-4:
        break    

print('n_iter', i)
print(loss.data[0])
plt.plot(losses)
plt.show()

Output:

[393494300459008.0, inf, inf, inf, nan, nan, nan, nan, nan, nan]

n_iter 9

nan

score 4 · Accepted Answer

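Several things contribute to this problem. Changing some or all of them will give you reasonable results and make learning possible.

Some of your (polynomial) features have huge variance and take very large values; check np.max(x_train_poly). When the weight matrix is randomly initialized, this makes the initial predictions wildly off, and the loss quickly approaches infinity. To counteract this, you may want to standardize your features first (i.e., give each feature mean 0 and variance 1). Note that very deep networks use a similar idea called "Batch Normalization". If you are interested, you can read more here: https://arxiv.org/abs/1502.03167. You can do the following to fix your example: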
```
import numpy as np

# Standardize each polynomial feature (column) to mean 0 and variance 1
means = np.mean(x_train_poly, axis=0, keepdims=True)
std = np.std(x_train_poly, axis=0, keepdims=True)
x_train_poly = (x_train_poly - means) / std
```
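To see how large the features actually get, here is a minimal NumPy-only sketch. It assumes the same training inputs as in the question and rebuilds the degree-6 monomials by hand; `poly_features` is a hypothetical helper written for this illustration, not part of scikit-learn. It also shows that new inputs should be scaled with the training statistics rather than their own.

```python
from itertools import combinations_with_replacement

import numpy as np

# Input columns (x1, x2) of the question's training data
x_train = np.array([[6, 4], [10, 4], [12, 5], [14, 7], [16, 9],
                    [18, 12], [22, 14], [24, 20], [26, 21], [32, 24]],
                   dtype=np.float64)


def poly_features(x, degree):
    """Hypothetical helper: all monomials of total degree 1..degree, no bias column."""
    columns = []
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(x.shape[1]), d):
            columns.append(np.prod(x[:, list(idx)], axis=1))
    return np.stack(columns, axis=1)


# The question uses degree input_size * POLY_DEGREE = 6
x_poly = poly_features(x_train, degree=6)

# Per-feature statistics computed on the training set only
means = x_poly.mean(axis=0, keepdims=True)
stds = x_poly.std(axis=0, keepdims=True)
x_std = (x_poly - means) / stds

# New inputs reuse the training means and standard deviations
x_test = np.array([[6, 4], [10, 5], [4, 8]], dtype=np.float64)
x_test_std = (poly_features(x_test, degree=6) - means) / stds

print("feature matrix shape:", x_poly.shape)
print("largest raw feature:", x_poly.max())
print("largest standardized magnitude:", np.abs(x_std).max())
```

The largest raw feature is the sixth power of the largest input (32), which is why a randomly initialized weight vector produces predictions that are off by many orders of magnitude.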
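Your current model has no hidden layers, which are more or less the point of a neural network when building a non-linear regressor/classifier. Right now you are just applying a linear transformation to the 27 input features to get something close to the output. You could add an additional layer like this: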
```
hidden_dim = 50

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.layer1 = torch.nn.Linear(poly.n_output_features_, hidden_dim)
        self.layer2 = torch.nn.Linear(hidden_dim, output_size)

    def forward(self, x):
        return self.layer2(torch.nn.ReLU()(self.layer1(x)))
```
Note that I added a non-linearity after the first linear transformation; otherwise there would be no point in having multiple layers.
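The reason a non-linearity is needed can be checked numerically. The sketch below uses plain NumPy with random weights; the shapes 27 and 50 mirror the feature count and `hidden_dim` above, and everything else is an assumption made for illustration. Two stacked linear layers without an activation are equivalent to a single linear layer, while inserting a ReLU breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 27))  # 5 samples, 27 input features

# Random weights and biases for two linear layers (27 -> 50 -> 1)
W1, b1 = rng.normal(size=(27, 50)), rng.normal(size=50)
W2, b2 = rng.normal(size=(50, 1)), rng.normal(size=1)

# Two linear layers applied back to back, no activation in between
stacked = (x @ W1 + b1) @ W2 + b2

# The same mapping written as one linear layer with merged parameters
merged = x @ (W1 @ W2) + (b1 @ W2 + b2)

# With a ReLU between the layers the mapping is no longer a single linear layer
with_relu = np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

print("stacked equals merged:", np.allclose(stacked, merged))
print("ReLU version equals merged:", np.allclose(with_relu, merged))
```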
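Your initial predictions are far off at the start, which drives the loss toward infinity. You are using a squared loss, which effectively doubles the order of magnitude of the initial "error" in the loss function. Once the loss is infinite, you cannot escape, because with a squared loss the gradient updates are essentially infinite too. A simple fix that sometimes helps is to use a smooth L1 loss instead: it is essentially MSE where the absolute error is in [0, 1] and L1 loss outside that interval. Change the following: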
```
criterion = torch.nn.SmoothL1Loss()
```
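The difference between the two losses can be illustrated without PyTorch. The following is a rough NumPy sketch; the functions `mse` and `smooth_l1` are hand-written stand-ins (with the threshold fixed at 1), not the library implementations, and the prediction values are made up to force an overflow in 32-bit floats.

```python
import numpy as np


def mse(pred, target):
    """Mean squared error; the square can overflow float32 for huge errors."""
    with np.errstate(over="ignore"):
        return np.mean((pred - target) ** 2)


def smooth_l1(pred, target):
    """Quadratic for absolute errors below 1, linear (L1-like) above."""
    err = np.abs(pred - target)
    with np.errstate(over="ignore"):
        return np.mean(np.where(err < 1.0, 0.5 * err ** 2, err - 0.5))


target = np.array([40.0, 44.0], dtype=np.float32)

# Made-up predictions that are off by about 20 orders of magnitude
big_pred = np.array([1e20, 2e20], dtype=np.float32)
print("MSE:", mse(big_pred, target))
print("smooth L1:", smooth_l1(big_pred, target))

# For small errors both losses behave quadratically
small_pred = np.array([40.5, 44.5], dtype=np.float32)
print("MSE (small error):", mse(small_pred, target))
print("smooth L1 (small error):", smooth_l1(small_pred, target))
```

Because the linear branch has a gradient of magnitude at most 1 per sample, a single bad prediction cannot blow up the update the way it does with a squared loss.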
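This already gets you something sensible (i.e., no more infs), but now consider tuning the learning rate and introducing weight_decay. You may also want to change the optimizer. Some suggestions that work okay: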
```
# Use one of the following, not both:
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1)
# or
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=0.1)
```
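To make the effect of learning rate and weight decay concrete, here is a small NumPy stand-in for the training loop. It is a sketch under several assumptions: a purely linear model, full-batch gradient descent, standardized degree-6 features, and weight decay applied by adding `weight_decay * parameter` to each gradient. The helpers `poly_features` and `train_linear`, and the chosen values `lr=0.01`, `weight_decay=0.1`, `steps=500`, are illustrative and not taken from the answer.

```python
from itertools import combinations_with_replacement

import numpy as np

# Training data from the question: column 0 is the target, columns 1-2 are inputs
train = np.array([[40, 6, 4], [44, 10, 4], [46, 12, 5], [48, 14, 7],
                  [52, 16, 9], [58, 18, 12], [60, 22, 14], [68, 24, 20],
                  [74, 26, 21], [80, 32, 24]], dtype=np.float64)
y = train[:, 0]
x = train[:, 1:3]


def poly_features(x, degree):
    """Hypothetical helper: all monomials of total degree 1..degree, no bias column."""
    columns = []
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(x.shape[1]), d):
            columns.append(np.prod(x[:, list(idx)], axis=1))
    return np.stack(columns, axis=1)


# Standardize every feature column before training
features = poly_features(x, degree=6)
features = (features - features.mean(axis=0)) / features.std(axis=0)


def train_linear(features, y, lr, weight_decay, steps):
    """Full-batch gradient descent on MSE with weight decay added to the gradients."""
    w = np.zeros(features.shape[1])
    b = 0.0
    losses = []
    for _ in range(steps):
        err = features @ w + b - y
        losses.append(np.mean(err ** 2))
        # Gradient of the mean squared error plus the weight-decay term
        grad_w = 2.0 * features.T @ err / len(y) + weight_decay * w
        grad_b = 2.0 * np.mean(err) + weight_decay * b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


w, b, losses = train_linear(features, y, lr=0.01, weight_decay=0.1, steps=500)
print("first loss:", losses[0])
print("last loss:", losses[-1])
```

With standardized features the loss stays finite at this learning rate; the same loop on the raw features would need a far smaller step size to avoid diverging.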
