
I am trying to apply MaxPool2d (from torch.nn) to a single image (not as a maxpool layer inside a network). This is my code right now:

name = 'astronaut'
imshow(images[name], name)
img = images[name]
# pool of square window of size=3, stride=1
m = nn.MaxPool2d(3,stride = 1)
img_transform = torch.Tensor(images[name])
plt.imshow(m(img_transform).view((512,510)))

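The problem is that this code gives me a very green image. I am sure the problem is in the dimensions passed to view, but I could not find out how to apply maxpool to just one image, so I can't fix it. The image I am working with is 512x512. The arguments to view make no sense to me right now; they are simply the only numbers that give a result...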

For example, if I pass 512,512 as the arguments to view, I get the following error:

RuntimeError: shape '[512, 512]' is invalid for input of size 261120

If someone could tell me how to apply maxpool, avgpool, or minpool to an image and display the result, I would be really grateful!

Thanks (:


1 Answer


Assuming your image is a numpy array when loaded (see the comments for an explanation of each step):

import numpy as np
import torch

# Assuming you have 3 color channels in your image
# Assuming your data is in Height, Width, Channels format
# (high is exclusive, so 256 allows pixel values up to 255)
numpy_img = np.random.randint(low=0, high=256, size=(512, 512, 3))

# Transform to tensor
tensor_img = torch.from_numpy(numpy_img)
# PyTorch takes images in format Channels, Height, Width
# We have to switch their dimensions using `permute`
tensor_img = tensor_img.permute(2, 0, 1)
tensor_img.shape # Shape [3, 512, 512]

# Layers always need batch as first dimension (even for one image)
# unsqueeze will add it for you    
ready_tensor_img = tensor_img.unsqueeze(dim=0)
ready_tensor_img.shape # Shape [1, 3, 512, 512]

pooling = torch.nn.MaxPool2d(kernel_size=3, stride=1)

# You need to cast your image to float as
# pooling is not implemented for Tensors of type long
new_img = pooling(ready_tensor_img.float())
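
As a side note on the error in the question, here is a rough sketch of where the size 261120 could come from. It assumes the original image was a 512x512x3 array in height, width, channels order; the question does not say so, but that layout is consistent with the reported number. Without `permute`, the pooling layer reads the first image axis as channels and pools over the last two, so the 3 color values collapse into 1. The names `hwc` and `wrong` are made up for this illustration.

```python
import torch

# Assumption: stand-in for the question's image, laid out as (H, W, C) = (512, 512, 3)
hwc = torch.rand(512, 512, 3)

pool = torch.nn.MaxPool2d(kernel_size=3, stride=1)

# With only a batch axis added and no permute, the layer interprets the
# input as (N, C, H, W) = (1, 512, 512, 3): 512 "channels" of size 512 x 3
wrong = pool(hwc.unsqueeze(dim=0))

# Each pooled axis shrinks to (size - kernel_size) // stride + 1:
# 512 -> 510 and 3 -> 1, while the leading "channel" axis stays at 512
print(wrong.shape)    # torch.Size([1, 512, 510, 1])
print(wrong.numel())  # 261120 = 512 * 510 * 1
```

Under that assumption, `view((512, 510))` only works because the element count happens to match; the result mixes rows and color channels rather than being a pooled image.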

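If your image is black and white, you need the shape [1, 1, 512, 512] (a single channel only). You can't leave out or squeeze those dimensions; they always have to be there for any torch.nn.Module.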
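
A minimal sketch of the black and white case, assuming the image is loaded as a 2D array with no channel axis (the random `gray` array is only a placeholder for real data):

```python
import numpy as np
import torch

# Hypothetical grayscale image: a 2D array (height, width), no channel axis
gray = np.random.randint(low=0, high=256, size=(512, 512))

# Add the channel axis, then the batch axis: [512, 512] -> [1, 1, 512, 512]
gray_tensor = torch.from_numpy(gray).unsqueeze(dim=0).unsqueeze(dim=0)

pooling = torch.nn.MaxPool2d(kernel_size=3, stride=1)
pooled = pooling(gray_tensor.float())  # Shape [1, 1, 510, 510]

# Drop the batch and channel axes again to get a plain 2D array back
gray_result = pooled.squeeze(dim=0).squeeze(dim=0).numpy()  # Shape (510, 510)
```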

To turn the tensor back into an image, you can use similar steps:

# Cast to long and squeeze batch dimension
no_batch = new_img.long().squeeze(dim=0)

# Undo the permute: back to Height, Width, Channels
height_width_channels = no_batch.permute(1, 2, 0)
height_width_channels.shape  # Shape: [510, 510, 3]

# Cast to numpy and you have your image
final_image = height_width_channels.numpy()
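
Since the question also asks about avgpool and minpool, here is a sketch of how the same pipeline might look for all three. `torch.nn.AvgPool2d` is a real layer; as far as I know torch.nn has no dedicated min pooling layer, so the `min_pool` helper below emulates it by negating the input around a max pool. `min_pool` and `to_image` are names invented for this example.

```python
import numpy as np
import torch

# Stand-in image in (height, width, channels) layout, values 0..255
numpy_img = np.random.randint(low=0, high=256, size=(512, 512, 3))
batch = torch.from_numpy(numpy_img).permute(2, 0, 1).unsqueeze(dim=0).float()

max_pool = torch.nn.MaxPool2d(kernel_size=3, stride=1)
avg_pool = torch.nn.AvgPool2d(kernel_size=3, stride=1)


def min_pool(x):
    # Assumption: min pooling emulated as the negated max pool of the negated input
    return -max_pool(-x)


def to_image(pooled):
    # [1, C, H, W] float tensor -> (H, W, C) uint8 array
    hwc = pooled.squeeze(dim=0).permute(1, 2, 0)
    return hwc.round().clamp(0, 255).to(torch.uint8).numpy()


max_img = to_image(max_pool(batch))
avg_img = to_image(avg_pool(batch))
min_img = to_image(min_pool(batch))
```

Each of the resulting arrays has shape (510, 510, 3) and dtype uint8, so it can be passed directly to `plt.imshow` from matplotlib to display it.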
Answered 2020-04-05T23:45:48.267