python - PyTorch max pooling over the channel dimension

Question

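I'm trying to build a CNN with PyTorch and am having trouble with max pooling. I took Stanford's cs231n, and as I recall, max pooling can be used as a dimensionality reduction step. For example, say I have a (1, 20, height, width) input to max_pool2d (assuming my batch_size is 1). If I use a (1, 1) kernel, I want an output of shape (1, 1, height, width), which means the kernel should slide over the channel dimension. However, the PyTorch documentation says the kernel slides over height and width. Thanks to @ImgPrcSng on the PyTorch forum, who told me to use max_pool3d, and that worked well. But there is still a reshape operation between the output of the conv2d layer and the input of the max_pool3d layer, which makes it hard to fold everything into a single nn.Sequential. Is there another way to do this?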
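For readers following along, the behaviour described in the question can be reproduced with a short sketch. The tensor sizes are arbitrary, and the unsqueeze/squeeze pair is an assumption about what the "reshape operation" around max_pool3d looked like; the question does not show the actual code.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 20, 8, 8)  # (batch, channels, height, width); sizes are arbitrary

# max_pool2d slides only over height and width, so a (1, 1) kernel changes nothing.
same = F.max_pool2d(x, kernel_size=(1, 1))

# Workaround: add a dummy dimension so the channels become the "depth" axis
# that max_pool3d pools over, then drop the dummy dimension again.
pooled = F.max_pool3d(x.unsqueeze(1), kernel_size=(x.size(1), 1, 1)).squeeze(1)
```

Here `same` keeps all 20 channels, while `pooled` has a single channel holding the per-pixel maximum. The extra unsqueeze/squeeze step is what gets in the way of a plain nn.Sequential.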

score 11 · Accepted Answer

Would something like this work?

from torch.nn import MaxPool1d
import torch.nn.functional as F


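class ChannelPool(MaxPool1d):
    def forward(self, input):
        # w and h are just labels for the two spatial dimensions.
        n, c, w, h = input.size()
        # Flatten the spatial dimensions and move channels last: (n, w * h, c).
        input = input.view(n, c, w * h).permute(0, 2, 1)
        # max_pool1d slides over the last dimension, which is now the channels.
        pooled = F.max_pool1d(
            input,
            self.kernel_size,
            self.stride,
            self.padding,
            self.dilation,
            self.ceil_mode,
            self.return_indices,
        )
        # Assumes return_indices is False; otherwise a tuple is returned here.
        _, _, c = pooled.size()
        # Restore the (n, c, w, h) layout with the reduced channel count.
        pooled = pooled.permute(0, 2, 1)
        return pooled.view(n, c, w, h)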
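As a rough usage sketch, the class can be dropped straight into nn.Sequential after a convolution. The version below is a condensed restatement of the class above so that the snippet runs on its own; it ignores return_indices and uses reshape instead of view, and the layer sizes are made up for illustration.

```python
import torch
import torch.nn.functional as F
from torch import nn


class ChannelPool(nn.MaxPool1d):
    """Condensed restatement of the class above (return_indices is not handled)."""

    def forward(self, input):
        n, c, h, w = input.size()
        # Put channels last so max_pool1d slides along them.
        flat = input.reshape(n, c, h * w).permute(0, 2, 1)
        pooled = F.max_pool1d(flat, self.kernel_size, self.stride,
                              self.padding, self.dilation, self.ceil_mode)
        return pooled.permute(0, 2, 1).reshape(n, -1, h, w)


model = nn.Sequential(
    nn.Conv2d(3, 20, kernel_size=3, padding=1),
    ChannelPool(kernel_size=20),  # collapse all 20 channels into one
)
out = model(torch.randn(1, 3, 8, 8))  # shape (1, 1, 8, 8)

# A smaller kernel pools channels in groups instead of all at once.
grouped = ChannelPool(kernel_size=4)(torch.randn(1, 20, 8, 8))  # shape (1, 5, 8, 8)
```

Because the kernel size is a constructor argument, the same module covers both the "reduce to one channel" case from the question and pooling over groups of channels.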

Alternatively, using einops:

from torch.nn import MaxPool1d
import torch.nn.functional as F
from einops import rearrange


class ChannelPool(MaxPool1d):
    def forward(self, input):
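        n, c, w, h = input.size()
        # max_pool1d pools over the last dimension, so channels are moved last,
        # pooled, and then the original (n, c, w, h) layout is restored.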
        pool = lambda x: F.max_pool1d(
            x,
            self.kernel_size,
            self.stride,
            self.padding,
            self.dilation,
            self.ceil_mode,
            self.return_indices,
        )
        return rearrange(
            pool(rearrange(input, "n c w h -> n (w h) c")),
            "n (w h) c -> n c w h",
            n=n,
            w=w,
            h=h,
        )

score 4 · Accepted Answer

To max-pool across all channels at each spatial coordinate, just use a layer from einops:

from einops.layers.torch import Reduce

max_pooling_layer = Reduce('b c h w -> b 1 h w', 'max')

The layer can be used in your model like any other torch module.
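To illustrate how such a layer slots into nn.Sequential, here is a minimal sketch. To keep it free of the einops dependency, it uses a small hand-written module, ChannelMax (a hypothetical name, not part of PyTorch or einops), which should compute the same per-pixel channel maximum; the Reduce layer above could be placed in the same position. The layer sizes are arbitrary.

```python
import torch
from torch import nn


class ChannelMax(nn.Module):  # hypothetical name, not part of PyTorch or einops
    """Maximum over the channel dimension, keeping a singleton channel axis."""

    def forward(self, x):
        # (b, c, h, w) -> (b, 1, h, w)
        return torch.amax(x, dim=1, keepdim=True)


model = nn.Sequential(
    nn.Conv2d(3, 20, kernel_size=3, padding=1),
    ChannelMax(),
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # next layer sees one channel
)
out = model(torch.randn(1, 3, 32, 32))  # shape (1, 8, 32, 32)
```

No reshape is needed between layers, which is what the question was after.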
