keras - Change input tensor shape for VGG16 application

Question

I want to feed images of shape (160,320,3) to

 VGG16(input_tensor=input_tensor, include_top=False)

How can I include a layer that reshapes the images to the shape the VGG16 model expects, i.e. (224,224,3)?

score 18 · Accepted Answer

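The VGG16 model itself is just a set of weights for a fixed sequence of layers with fixed convolution kernel sizes, and so on. That does not mean those kernels cannot be applied to images of other sizes.

For example, in your case: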

from keras.models import Model
from keras.layers import Dense, Flatten
from keras.applications import vgg16

# Load the convolutional base with ImageNet weights for the new input shape
model = vgg16.VGG16(weights='imagenet', include_top=False, input_shape=(160, 320, 3))
model.summary(line_length=150)

# New classification head: flatten the feature map, then a 10-way softmax
flatten = Flatten()
new_layer2 = Dense(10, activation='softmax', name='my_dense_2')

inp2 = model.input
out2 = new_layer2(flatten(model.output))

model2 = Model(inp2, out2)
model2.summary(line_length=150)

According to the linked source, the minimum image size is 48x48x3; anything larger than that is fine.
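To see why other input sizes work, and why there is a lower bound at all, it can help to trace the spatial size through the network. The sketch below assumes the standard VGG16 layout: 'same'-padded 3x3 convolutions that keep height and width, five 2x2 max-pooling stages that halve them (rounding down), and 512 channels in the last block. The helper name `vgg16_feature_shape` is made up for illustration and is not part of Keras.

```python
def vgg16_feature_shape(height, width, num_pools=5, channels=512):
    """Shape of the last feature map of VGG16 with include_top=False (illustrative helper)."""
    for _ in range(num_pools):
        # 'same'-padded convolutions keep the size; each 2x2 max pool halves it (floor)
        height, width = height // 2, width // 2
    return (height, width, channels)


print(vgg16_feature_shape(160, 320))  # (5, 10, 512)
print(vgg16_feature_shape(224, 224))  # (7, 7, 512)
print(vgg16_feature_shape(48, 48))    # (1, 1, 512)
```

For (160,320,3) the Flatten layer therefore sees 5 * 10 * 512 = 25600 values. Inputs that are too small shrink to a zero-sized feature map, which is why the library enforces a minimum size.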

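It is true that the original weights were learned on images of shape 224,224,3, but the filter weights are a very good starting point for a new task with a new set of images. You do need to retrain the network, but it will converge quickly. This is the basis of transfer learning.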

score 0 · Accepted Answer

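You need to do two things:

Explicitly declare that the input has a variable size by setting the image width and height to None in the input shape.
Don't use flatten(), because it depends on a fixed input shape. Use GlobalMaxPooling instead, which not only performs adaptive pooling but also flattens the input tensor for the FC layer.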
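The two points above could be sketched as follows. This is a minimal example, not a tuned model: it assumes the `keras` package is importable, uses `weights=None` only to avoid a download (pass `weights='imagenet'` for transfer learning), and the 10-class head and the layer name `my_dense` are placeholders.

```python
import numpy as np
from keras.applications import VGG16
from keras.layers import Dense, GlobalMaxPooling2D
from keras.models import Model

# Height and width are left as None so the model accepts variable-sized images
base = VGG16(weights=None, include_top=False, input_shape=(None, None, 3))

# Global max pooling reduces (batch, H, W, 512) to (batch, 512) for any H and W
pooled = GlobalMaxPooling2D()(base.output)
outputs = Dense(10, activation='softmax', name='my_dense')(pooled)
model = Model(base.input, outputs)

# The same model handles batches with different spatial sizes
for h, w in [(160, 320), (224, 224)]:
    batch = np.zeros((1, h, w, 3), dtype='float32')
    print((h, w), '->', model.predict(batch, verbose=0).shape)
```

Within a single batch all images still need the same size; only different batches can differ. Passing `pooling='max'` to `VGG16` should give the same pooled output without adding the layer by hand.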

I hope this helps you achieve your goal.

score 0 · Accepted Answer

You can use the resize() function from the OpenCV library.

import cv2

width = 224
height = 224
dim = (width, height)  # cv2.resize expects (width, height)

# images holds the original-size images as an array of shape (N, H, W, 3)
resized_images = []
for i in range(images.shape[0]):
    resized = cv2.resize(images[i], dim, interpolation=cv2.INTER_AREA)
    resized_images.append(resized)
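If the resizing should happen inside the model, as the question asks, a resizing layer can be placed in front of VGG16. The sketch below assumes a Keras version that provides `keras.layers.Resizing` (older releases kept it under an experimental preprocessing module) and uses `weights=None` only to avoid a download.

```python
from keras.applications import VGG16
from keras.layers import Input, Resizing
from keras.models import Model

# The model takes the original (160, 320, 3) images as input
inputs = Input(shape=(160, 320, 3))

# Resize to 224x224 inside the graph (bilinear interpolation by default)
resized = Resizing(224, 224)(inputs)

# Pass weights='imagenet' here for transfer learning
base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))
model = Model(inputs, base(resized))

print(model.output_shape)  # (None, 7, 7, 512)
```

Note that going from 160x320 to 224x224 changes the aspect ratio, so the images are stretched, just as with the cv2.resize call above.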
