pytorch - Question about writing a CNN with PyTorch

Question

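I am very new to CNN programming, so I am a bit lost. I am trying to complete this piece of code, where I am asked to implement a fully connected network to classify digits. It should contain 1 hidden layer with 20 units, and I should use the ReLU activation function on the hidden layer.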

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.fc1 = ... 
        
        self.fc2 = nn.Sequential(
            nn.Linear(500,10),
            nn.Softmax(dim = 1)
            )
        
    def forward(self, x):
        x = x.view(x.size(0),-1)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

The dots are the part I need to fill in, and I came up with this line:

self.fc1 = nn.Linear(20, 500)

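But I don't know whether it is correct. Can someone help me? Also, I don't understand at all what Softmax does, so if anyone knows, please explain. Thank you very much!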

P.S. This is the code that loads the data:

batch_size = 64
trainset = datasets.MNIST('./data', train=True, download=True, transform=transforms.ToTensor())
train_loader = DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=1)
testset = datasets.MNIST('./data', train=False, download=True, transform=transforms.ToTensor())
test_loader = DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=1)

score 0 · Accepted Answer

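From the code given for the model, it can be seen that the hidden layer has 500 units. So I am assuming you meant 20 units of input. With that assumption, the code would have to be: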

self.fc1 = nn.Sequential(
    nn.Linear(20, 500),
    nn.ReLU()
    )

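Coming to the next part of the question: given that you are using the MNIST dataset and you have a softmax function, I assume you are trying to predict the digit present in the image. Your neural network performs various multiplications and additions at each layer, and in the end you get 10 numbers in the output layer. You now have to interpret these 10 numbers to decide which of the 10 digits is shown in the image.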

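One way is to pick the unit with the largest value. For example, if the 10th unit has the largest value among all units, we conclude that the digit is "9". If the 2nd unit has the largest value, we conclude that the digit is "1".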
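The "pick the largest unit" rule can be sketched with torch.argmax. The output values below are made up for illustration and do not come from a trained model; units are indexed from 0, so index 9 is the 10th unit and index 1 is the 2nd.

```python
import torch

# Hypothetical raw outputs for a batch of two images, ten values per image.
outputs = torch.tensor([
    [0.1, 0.2, 0.1, 0.0, 0.3, 0.1, 0.0, 0.2, 0.1, 2.5],  # 10th unit is largest
    [0.3, 3.1, 0.2, 0.1, 0.0, 0.4, 0.1, 0.2, 0.0, 0.1],  # 2nd unit is largest
])

# argmax over dim=1 returns, for each image, the index of its largest unit.
# Because indices start at 0, the index is directly the predicted digit.
predicted_digits = outputs.argmax(dim=1)
print(predicted_digits)
```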

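This works, but a better way is to convert the value of each unit into the probability that the corresponding digit is in the image, and then pick the digit with the highest probability. This has certain mathematical advantages and helps us define a better loss function.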

Softmax helps us convert the values into probabilities. After applying softmax, all values lie in the range (0, 1) and they sum to 1.
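As a small illustration of those two properties, here is a sketch that applies torch.softmax to some made-up values (the numbers are assumptions, not real network outputs). It also shows that softmax does not change which unit is the largest.

```python
import torch

# Hypothetical raw outputs (often called logits) for one image, ten units.
logits = torch.tensor([[1.0, 2.0, 0.5, -1.0, 3.0, 0.0, 0.2, -0.5, 1.5, 0.1]])

# dim=1 normalises across the ten units of each image, not across the batch.
probs = torch.softmax(logits, dim=1)

print(probs)             # every entry lies strictly between 0 and 1
print(probs.sum(dim=1))  # each row sums to 1, up to floating-point rounding

# The most probable digit is the unit that had the largest raw value.
print(probs.argmax(dim=1), logits.argmax(dim=1))
```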

If you are interested in deep learning and the math behind it, I suggest you check out Andrew Ng's deep learning course.

score 0 · Accepted Answer

You did not mention the shape of your data so I'll be assuming the expected shape returned by datasets.MNIST.

Data shape: torch.Size([64, 1, 28, 28])

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.fc1 = nn.Sequential(
            nn.Linear(1*28*28, 20),
            nn.ReLU())
        
        self.fc2 = nn.Sequential(
            nn.Linear(20, 10),  # input size must match the 20 units output by fc1
            nn.Softmax(dim = 1))
        
    def forward(self, x):
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

The first argument of nn.Linear is the number of input features, while the second is the number of output units.

For self.fc1, the number of input features is the product of all dimensions of your data shape except the batch size, which is 1 * 28 * 28 = 784. As per your post, the second argument should be 20 (20 units).

The shape of the output from self.fc1 (which is also the input to self.fc2) will then be (batch size, 20).

For self.fc2, the size of the input feature will be 20 while the number of units (which is also the number of digits) will be 10.
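To check the shapes described above without downloading MNIST, here is a minimal sketch that pushes a random tensor of the assumed batch shape (64, 1, 28, 28) through a network with those layer sizes. The dummy_batch, hidden and output names are only for illustration.

```python
import torch
import torch.nn as nn

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        # 784 input features -> 20 hidden units, followed by ReLU
        self.fc1 = nn.Sequential(nn.Linear(1 * 28 * 28, 20), nn.ReLU())
        # 20 hidden units -> 10 output units, one per digit
        self.fc2 = nn.Sequential(nn.Linear(20, 10), nn.Softmax(dim=1))

    def forward(self, x):
        x = x.view(x.size(0), -1)  # flatten (N, 1, 28, 28) to (N, 784)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

model = Network()

# Random stand-in for one batch of MNIST images (assumed shape, not real data).
dummy_batch = torch.rand(64, 1, 28, 28)

hidden = model.fc1(dummy_batch.view(64, -1))  # output of the hidden layer
output = model(dummy_batch)                   # output of the whole network

print(hidden.shape)  # (batch size, 20)
print(output.shape)  # (batch size, 10)
```

If the input size of a layer does not match the output size of the previous one, the forward pass raises a shape mismatch error, so a quick dummy-batch check like this catches the problem before training.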
