python - Huge performance difference between PyTorch and Keras implementations in text classification

Question

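I have implemented a CNN in both Keras and PyTorch for a multi-label text classification task. The two implementations end up with very different performance: the Keras CNN clearly outperforms the PyTorch CNN. Both use a single convolutional layer with a kernel size of 4.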

The PyTorch version reaches a micro F1 score of 0.023 and a macro F1 score of 0.47. The model is shown below (more details are in the colab notebook):

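import torch.nn as nn
import torch.nn.functional as F

class CNN_simple(nn.Module):
    # Note: filter_sizes is accepted but unused; the kernel size is fixed at 4.
    def __init__(self, vocab_size, embedding_dim, n_filters, filter_sizes, output_dim,
                 dropout, pad_idx):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=pad_idx)
        # Conv1d expects input of shape (N, C, L): batch, channels, sequence length
        self.conv = nn.Conv1d(in_channels=embedding_dim,
                              out_channels=n_filters,
                              kernel_size=4)

        self.fc = nn.Linear(n_filters, output_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, text):
        embedded = self.embedding(text)        # (batch, seq_len, embedding_dim)
        embedded = embedded.permute(0, 2, 1)   # (batch, embedding_dim, seq_len)
        conved = F.relu(self.conv(embedded))   # (batch, n_filters, seq_len - 3)
        # Global max pooling over the whole sequence dimension
        pooled = F.max_pool1d(conved, conved.shape[2]).squeeze(2)  # (batch, n_filters)
        dropout = self.dropout(pooled)
        return self.fc(dropout)                # raw logits, (batch, output_dim)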

The Keras version reaches a micro F1 score of 0.70 and a macro F1 score of 0.56. The model is shown below (more details are in the colab notebook):

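from keras.layers import Input, Embedding, Conv1D, GlobalMaxPool1D, Dense
from keras.models import Model

# MAX_TEXT_LENGTH, MAX_VOCAB_SIZE and embed_size are defined in the colab notebook.
def get_model_cnn():
    inp = Input(shape=(MAX_TEXT_LENGTH, ))
    x = Embedding(MAX_VOCAB_SIZE, embed_size)(inp)

    # One convolutional layer with embed_size filters and kernel size 4
    x = Conv1D(embed_size, 4, activation="relu")(x)
    x = GlobalMaxPool1D()(x)
    # Sigmoid output: one independent probability per label (6 labels)
    x = Dense(6, activation="sigmoid")(x)

    model = Model(inputs=inp, outputs=x)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model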
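
For readers comparing the two score pairs above, here is a minimal pure-Python sketch of how micro- and macro-averaged F1 differ for multi-label predictions. The function name `f1_scores` and the toy data are illustrative only and are not taken from the notebook. Returning 0.0 for a label with no true or predicted positives is an assumption about the convention in use.

```python
def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for multi-label 0/1 matrices (lists of rows)."""
    n_labels = len(y_true[0])
    tp = [0] * n_labels  # true positives per label
    fp = [0] * n_labels  # false positives per label
    fn = [0] * n_labels  # false negatives per label
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if t == 1 and p == 1:
                tp[j] += 1
            elif t == 0 and p == 1:
                fp[j] += 1
            elif t == 1 and p == 0:
                fn[j] += 1

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        # Assumed convention: F1 is 0.0 when there is nothing to score
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all labels, then compute a single F1
    micro = f1(sum(tp), sum(fp), sum(fn))
    # Macro: compute F1 per label, then take the unweighted mean
    macro = sum(f1(a, b, c) for a, b, c in zip(tp, fp, fn)) / n_labels
    return micro, macro


# Toy data (hypothetical): 4 samples, 3 labels; only the frequent label is predicted
y_true = [[1, 0, 0], [1, 0, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 0, 0]]
micro, macro = f1_scores(y_true, y_pred)  # micro = 0.75, macro = 1/3
```

Micro F1 is dominated by frequent labels, while macro F1 weights every label equally, so the two can diverge sharply on imbalanced label sets.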

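I think something is wrong with my PyTorch implementation. Any comments are appreciated. I have created a colab notebook with the complete PyTorch and Keras implementations; feel free to copy and run it. Please point out anything I did wrong in the PyTorch implementation. Thanks.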
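
One visible difference between the two listings is that the Keras model ends in a sigmoid and is trained with `binary_crossentropy`, while `CNN_simple.forward` returns raw logits. The training and evaluation code is not shown here, so the following is only a sketch of something worth checking, not a diagnosis: with logits, the matching PyTorch loss is `nn.BCEWithLogitsLoss`, and predictions need a sigmoid before thresholding. The tensors below are random stand-ins, and the 0.5 threshold is an assumption.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Random stand-ins for a batch of model outputs and multi-label targets
logits = torch.randn(8, 6)                     # (batch, labels), raw scores
targets = torch.randint(0, 2, (8, 6)).float()  # 0/1 label matrix

# Loss computed directly on logits (sigmoid is applied internally)
loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Equivalent formulation: explicit sigmoid followed by plain BCE,
# which mirrors sigmoid output + binary_crossentropy in Keras
probs = torch.sigmoid(logits)
loss_from_probs = nn.BCELoss()(probs, targets)

# Predictions for F1: threshold the probabilities, not the raw logits
preds = (probs > 0.5).int()
```

If `nn.BCELoss` were applied to the raw logits, or the logits were thresholded at 0.5 without a sigmoid, the PyTorch numbers would not be comparable to the Keras ones.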

