python - ImageNet classification problem using VGG16 pretrained weights

Question

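I am trying to run plain ImageNet classification in TensorFlow using the VGG16 network (with VGG16 provided through Keras).

However, when I try to classify a sample image of an elephant, it gives completely unexpected results.

I cannot figure out what the problem might be.

Here is the complete code I am using: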

import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.python.keras._impl.keras.applications import imagenet_utils


model = tf.keras.applications.VGG16()
VGG = model.graph

VGG.get_operations()
input = VGG.get_tensor_by_name("input_1:0")
output = VGG.get_tensor_by_name("predictions/Softmax:0")
print(input)
print(output)

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = np.expand_dims(image_array, axis=0)


with tf.Session(graph=VGG) as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    pred = (sess.run(output,{input:image_array}))
    print(imagenet_utils.decode_predictions(pred))

Here is the sample output I get:

Tensor("input_1:0", shape=(?, 224, 224, 3), dtype=float32)
Tensor("predictions/Softmax:0", shape=(?, 1000), dtype=float32)

[[('n02281406', 'sulphur_butterfly', 0.0022673723), ('n01882714', 'koala', 0.0021256246), ('n04325704', 'stole', 0.0020583202), ('n01496331', 'electric_ray', ...), ('n01797886', 'ruffed_grouse', 0.0020229272)]]

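Judging by the probabilities, something seems to be wrong with the image data being passed in (all of the values are very low).

But I cannot figure out what is wrong.
And, as a human, I am quite sure the picture shows an elephant!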

score 0 · Accepted Answer

I think there are two mistakes. The first is that you have to rescale the image by dividing all pixels by 255.

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = image_array / 255.  # in-place /= fails on a uint8 array
image_array = np.expand_dims(image_array, axis=0)
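
A practical detail when rescaling: `np.array()` on an 8-bit PIL image typically yields a `uint8` array, and NumPy refuses an in-place `/=` with a float on integer data, so the division has to produce a new float array. A minimal sketch, using a synthetic array as a stand-in for the image (the variable names are only illustrative):

```python
import numpy as np

# Synthetic stand-in for np.array(PIL image): 8-bit RGB values in [0, 255].
pixels = np.full((224, 224, 3), 255, dtype=np.uint8)

try:
    pixels /= 255.0          # in-place true division on a uint8 array
    in_place_ok = True
except TypeError:
    in_place_ok = False      # NumPy will not cast the float result back to uint8

# Building a new array works and yields floats in [0, 1].
scaled = pixels / 255.0                          # float64
scaled32 = pixels.astype(np.float32) / 255.0     # float32, matching the model input dtype

print(in_place_ok, scaled.dtype, scaled32.dtype)
```

Casting to `float32` first keeps the dtype in line with the `float32` input tensor printed in the question.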

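The second point came to me when looking at the prediction values. You have a vector of 1000 elements, and after rescaling they all have a probability of about 0.1%. That means you have an untrained model. I don't know how to load the weights in TensorFlow, but in Keras, for example, you can do this: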

from keras import applications

app = applications.vgg16
model = app.VGG16(
        include_top=True,     # keep the standard 1000-class ImageNet classifier
        weights='imagenet')   # load the pretrained weights; with None they are random

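From what I have read, you have to download a separate file containing the weights, for example from GitHub.

I hope this helps.

Edit 1:

I tried the same model with Keras: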

from keras.applications.vgg16 import VGG16, decode_predictions
from PIL import Image
import numpy as np

model = VGG16(weights='imagenet')

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = image_array/255.
x = np.expand_dims(image_array, axis=0)

preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=5)[0])

With the rescaling in place, I get bad predictions:

Predicted: [('n03788365', 'mosquito_net', 0.22725257), ('n15075141', 'toilet_tissue', 0.026636025), ('n04209239', 'shower_curtain', 0.019786758), ('n02804414', 'bassinet', ...), ('n03131574', 'crib', 0.01316699)]

Without the rescaling, they are good:

Predicted: [('n02504458', 'African_elephant', 0.95870858), ('n01871265', 'tusker', 0.040065952), ('n02504013', 'Indian_elephant', 0.0012253703), ('n01704323', 'triceratops', ...), ('n02454379', 'armadillo', 5.0408511e-10)]

Now, if I remove the weights, I get the "same" result as with TensorFlow:

Predicted: [('n07717410', 'acorn_squash', 0.0010033853), ('n02980441', 'castle', 0.0010028203), ('n02124075', 'Egyptian_cat', 0.0010028186), ('n04179913', 'sewing_machine', 0.0010027955) , ('n02492660', 'howler_monkey', 0.0010027081)]

To me, this means that no weights are being applied. Maybe they were downloaded but not used.
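
To illustrate why scores of about 0.001 across the board point to missing weights, here is a minimal NumPy sketch. It assumes that an untrained network produces small, nearly identical logits, and the random logits below are a hypothetical stand-in for that, not actual VGG16 output. A softmax over 1000 such values is almost uniform, so every class lands near 1/1000.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()

rng = np.random.default_rng(0)
num_classes = 1000

# Hypothetical logits of an untrained classifier: tiny values around zero.
untrained_logits = rng.normal(loc=0.0, scale=0.01, size=num_classes)

probs = softmax(untrained_logits)
top5 = np.sort(probs)[::-1][:5]   # five largest probabilities, descending

print(top5)   # all values stay close to 1 / num_classes
```

A trained model, by contrast, concentrates most of the probability mass on a few classes, as in the elephant output above.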

score 0 · Accepted Answer

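It seems we can (or need to?) use the session from Keras (which has the loaded graph and weights associated with it), rather than creating a new session in TensorFlow and using the graph obtained from the Keras model, as shown below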

VGG = model.graph

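I think the graph above does not carry the weights (which is why the predictions were wrong), whereas the graph from the Keras session has the correct weights (so the two graph instances should be different).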

Here is the complete code:

import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.python.keras._impl.keras.applications import imagenet_utils
from tensorflow.python.keras._impl.keras import backend as K


model = tf.keras.applications.VGG16()
sess = K.get_session()
VGG = model.graph #Not needed and also doesnt have weights in it

VGG.get_operations()
input = VGG.get_tensor_by_name("input_1:0")
output = VGG.get_tensor_by_name("predictions/Softmax:0")
print(input)
print(output)

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = np.expand_dims(image_array, axis=0)
image_array = image_array.astype(np.float32)
image_array = tf.keras.applications.vgg16.preprocess_input(image_array)

pred = (sess.run(output,{input:image_array}))
print(imagenet_utils.decode_predictions(pred))

This gives the expected result:

[[('n02504458', 'African_elephant', 0.8518132), ('n01871265', 'tusker', 0.1398836), ('n02504013', 'Indian_elephant', 0.0082286), ('n01704323', 'triceratops', ...), ('n02397096', 'warthog', 1.8662439e-06)]]

Thanks to Idavid for the hint about using the preprocess_input() function, and to Nicolas for the hint about the weights not being loaded.
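
For readers wondering what preprocess_input() changes for VGG16: as far as I know, the Keras VGG16 variant uses Caffe-style preprocessing, meaning the channels are reordered from RGB to BGR and the per-channel ImageNet mean is subtracted, with no division by 255. The helper below is a hypothetical NumPy approximation for illustration only, not the library's own code.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (Caffe-style preprocessing).
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg16_preprocess_sketch(rgb_batch):
    """Hypothetical helper approximating vgg16.preprocess_input.

    Expects a batch shaped (N, H, W, 3) with RGB values in [0, 255].
    """
    x = np.asarray(rgb_batch, dtype=np.float32)
    x = x[..., ::-1]               # reorder channels: RGB -> BGR
    return x - IMAGENET_MEAN_BGR   # zero-center each channel, no rescaling

# Synthetic pure-red batch as a stand-in for a real image.
batch = np.zeros((1, 224, 224, 3), dtype=np.uint8)
batch[..., 0] = 255

out = vgg16_preprocess_sketch(batch)
print(out[0, 0, 0])   # red ends up in the last (BGR) position
```

This is consistent with the observation above that dividing by 255 made the predictions worse: the pretrained weights expect inputs on the 0 to 255 scale, shifted by the channel means.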
