python - Matrix multiplication dimensions are confusing

Question

I am following this tutorial: https://pytorch.org/tutorials/beginner/nlp/deep_learning_tutorial.html#example-logistic-regression-bag-of-words-classifier

nn.Linear(vocab_size, num_labels) means the matrix shape is num_labels x vocab_size.

bow_vector has dimensions 1 x vocab_size, and the expected input for nn.Linear is batch_size x features.

Now we are multiplying a num_labels x vocab_size matrix by a 1 x vocab_size one, so the dimensions do not match for matrix multiplication. What am I missing here?

https://discuss.pytorch.org/t/matrix-multiplication-dimentions-confusing/79376?u=abhigenie92

score 0 · Accepted Answer

You have misunderstood nn.Linear. Let me clarify it a bit.

nn.Linear(vocab_size, num_labels) does not mean the multiplication is (num_labels x vocab_size) times (1 x vocab_size).

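The signature is nn.Linear(input_dim, output_dim, bias=True) (PyTorch names these arguments in_features and out_features). Suppose you have 3 points in 3D space and you want to project them into 2D space. You just create a linear layer that does that for you => nn.Linear(3, 2, bias=True).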

Example:

import torch
from torch import nn

linear_function = nn.Linear(3, 2, bias=True)  # you have just created a function
a_3D_point = torch.tensor([[1.0, 1.0, 1.0]])  # shape: [1 x 3]
a_2D_point = linear_function(a_3D_point)  # shape: [1 x 2]

Basically, nn.Linear() just helps you create a function that can do the projection.
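As a quick sanity check on the shapes, here is a minimal sketch that sends three 3D points through the same kind of layer at once. The names `project`, `points_3d` and `points_2d` are made up for illustration, and the weights are randomly initialised, so only the shapes are meaningful.

```python
import torch
from torch import nn

project = nn.Linear(3, 2, bias=True)  # maps 3 input features to 2 output features

# Three points in 3D, stacked as rows: shape [3, 3] = [batch_size, in_features]
points_3d = torch.tensor([[1.0, 1.0, 1.0],
                          [0.0, 2.0, 4.0],
                          [5.0, 0.0, 1.0]])

points_2d = project(points_3d)  # one 2D point per input row

print(points_3d.shape)  # torch.Size([3, 3])
print(points_2d.shape)  # torch.Size([3, 2])
```

The first dimension is the batch and passes through unchanged; only the last dimension goes from 3 to 2.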

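So you may wonder how nn.Linear helps you do the projection. Mathematically it is simple: the projection is just y = Wx + b, or y = Wx if bias=False, where W is the weight and b is the bias, both created with random initial values by nn.Linear. With a batch of row vectors, as in PyTorch, this is computed as y = xW^T + b. You can check the parameters with: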

print(list(linear_function.parameters()))  # Unchecked since I use my iPad to write this answer
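To tie this back to the shapes in the question, here is a small sketch (the names `layer`, `x` and `manual` are illustrative only). It checks that the layer stores its weight as [out_features, in_features] but applies the transpose, so the multiplication that actually happens is [1 x 3] times [3 x 2].

```python
import torch
from torch import nn

layer = nn.Linear(3, 2, bias=True)
x = torch.tensor([[1.0, 1.0, 1.0]])  # shape [1, 3]

print(layer.weight.shape)  # torch.Size([2, 3]) -> [out_features, in_features]
print(layer.bias.shape)    # torch.Size([2])

# Reproduce the forward pass by hand: x @ W^T + b, i.e. [1, 3] @ [3, 2] -> [1, 2]
manual = x @ layer.weight.T + layer.bias
same = torch.allclose(layer(x), manual)
print(same)  # expected to be True
```

In the question's terms, the 1 x vocab_size input is multiplied by a vocab_size x num_labels matrix, giving a 1 x num_labels result.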

=================

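Mapping this to your case: as I understand it, BowClassifier just tries to classify a sentence into a finite number of classes. One of the simplest ways is to use a one-hot style vector (in the tutorial, a bag-of-words count vector) of shape n x vocab.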

n means you have n sentences, while the vocab in the second dimension now plays the role of the features representing each sentence.

You now want to classify the n sentences into num_labels classes, so just do the projection.

input = ...  # shape: [n x vocab]
classify_fn = nn.Linear(vocab, num_labels)
output = classify_fn(input)

# sigmoid or softmax to get the probability here
...
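To make the outline above concrete, here is a small sketch with made-up sizes (`n = 4` sentences, `vocab = 10`, `num_labels = 2`) and random counts standing in for real bag-of-words vectors. None of these values come from the tutorial, and softmax is just one of the two options mentioned above.

```python
import torch
from torch import nn
import torch.nn.functional as F

n, vocab, num_labels = 4, 10, 2  # hypothetical sizes, chosen for illustration

# Fake bag-of-words counts: one row per sentence, one column per vocabulary word
bow_batch = torch.randint(0, 3, (n, vocab)).float()  # shape [n, vocab]

classify_fn = nn.Linear(vocab, num_labels)
scores = classify_fn(bow_batch)   # shape [n, num_labels]
probs = F.softmax(scores, dim=1)  # normalise each row into probabilities

print(scores.shape)      # torch.Size([4, 2])
print(probs.sum(dim=1))  # each entry is approximately 1
```

Each row of `probs` is the predicted class distribution for one sentence.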
