tensorflow - Expected conv2d_7 to have shape (220, 220, 1) but got array with shape (224, 224, 1)

Question

I am following the tutorial on the Keras blog (https://blog.keras.io/building-autoencoders-in-keras.html) to build an autoencoder.

I am using my own dataset, and I run the following code on images of size 224*224.

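from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

input_img = Input(shape=(224, 224, 1))  # size of the input image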
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# at this point the representation is (28, 28, 8), i.e. 6272-dimensional

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

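When I look at the autoencoder's summary, the output of the last layer is 220 x 220. I have attached a snapshot of that summary.

What I don't understand is how it goes from 112*112 to 110*110. I expected conv2d_6 (Conv2D) to give me 112*112 with 16 kernels.

If I remove the conv2d_6 layer, it works. But I want to keep it, because otherwise I would be doing two UpSampling steps in a row. I don't understand what is going wrong.

Can anyone guide me?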

score 1 · Accepted Answer

You need to add padding='same' to that layer, so it should look like this:

x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)

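Then it will keep the same dimensions. Without it, the layer uses no padding (Keras defaults to padding='valid'), and because your kernel is 3×3, your 112*112 becomes 110*110 after that layer. The UpSampling2D that follows then doubles this to 220*220, which is why the output no longer matches the 224*224 input.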
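
To make the arithmetic concrete, here is a rough sketch in plain Python (not Keras) that traces only the spatial size through the layers from the question. The helper names conv_out, pool_out, upsample_out and trace are made up for this illustration, and the sketch assumes stride-1 convolutions and pooling whose stride equals the pool size, which is what the code in the question uses by default.

```python
def conv_out(size, kernel, padding):
    """Spatial output size of a stride-1 convolution (hypothetical helper)."""
    if padding == "same":
        return size               # zero-padded so the size is preserved
    if padding == "valid":
        return size - kernel + 1  # no padding: the kernel must fit entirely
    raise ValueError("padding must be 'same' or 'valid'")


def pool_out(size, pool):
    """Size after pooling with padding='same' and stride == pool: ceil(size / pool)."""
    return -(-size // pool)


def upsample_out(size, factor):
    """Size after upsampling, which repeats each row and column `factor` times."""
    return size * factor


def trace(third_decoder_padding, size=224):
    """Follow the spatial size through the autoencoder from the question."""
    # Encoder: three (Conv2D 'same' -> MaxPooling2D 2x2) stages.
    for _ in range(3):
        size = pool_out(conv_out(size, 3, "same"), 2)
    # Decoder: three (Conv2D -> UpSampling2D 2x2) stages.
    # Only the third Conv2D differs between the broken and the fixed model.
    size = upsample_out(conv_out(size, 3, "same"), 2)
    size = upsample_out(conv_out(size, 3, "same"), 2)
    size = upsample_out(conv_out(size, 3, third_decoder_padding), 2)
    # Final Conv2D with padding='same' keeps the size.
    return conv_out(size, 3, "same")


print(trace("valid"))  # 220: the model as posted in the question
print(trace("same"))   # 224: with padding='same' added to the third decoder Conv2D
```

The 'valid' case loses 2 pixels per side length at 112*112, and the last upsampling doubles that loss to 4, which accounts for 220 versus 224.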
