python - Caffe shape mismatch error using pretrained VGG-16 model

Question

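I am using PyCaffe to implement a neural network inspired by the VGG 16-layer network. I want to use the pre-trained model available from their GitHub page. Generally, this works by matching layer names.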

For my "fc6" layer, I have the following definition in my train.prototxt file:

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}

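Here is the prototxt file for the VGG-16 deploy architecture. Note that the "fc6" layer in their prototxt is identical to mine (except for the learning rate, but that is irrelevant). It is also worth noting that the inputs in my model are all the same size too: 3-channel 224x224px images.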

I have been following this tutorial closely, and the block of code that is giving me trouble is the following:

solver = caffe.SGDSolver(osp.join(model_root, 'solver.prototxt'))
solver.net.copy_from(model_root + 'VGG_ILSVRC_16_layers.caffemodel')
solver.test_nets[0].share_with(solver.net)
solver.step(1)

The first line loads my solver prototxt, and the second line copies the weights from the pre-trained model (VGG_ILSVRC_16_layers.caffemodel). When the solver runs, I get this error:

Cannot copy param 0 weights from layer 'fc6'; shape mismatch.  Source param 
shape is 1 1 4096 25088 (102760448); target param shape is 4096 32768 (134217728). 
To learn this layer's parameters from scratch rather than copying from a saved 
net, rename the layer.

The gist of it is that their model expects the layer to be of size 1x1x4096, while mine is just 4096. But I don't understand how to change this.

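I found this answer in the users Google group instructing me to do net surgery to reshape the pre-trained model before copying, but in order to do that I need the lmdb files from the original architecture's data layers, which I don't have (it throws an error when I try to run the net surgery script).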

score 7 · Accepted Answer

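The problem is not the 4096 but the 25088. You need to calculate the output feature map size for each layer of your network from its input feature map size. Note that an fc layer takes an input of fixed size, so the output of the preceding conv layer must match the input size that the fc layer requires. Calculate your fc6 input feature map size (that is, the output feature map of the preceding conv layer) from the input feature map size of that conv layer. Here is the formula: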

H_out = (H_in + 2 x Padding_Height - Kernel_Height) / Stride_Height + 1
W_out = (W_in + 2 x Padding_Width - Kernel_Width) / Stride_Width + 1
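
As a rough illustration of applying this formula layer by layer, here is a minimal sketch. The function names (`conv_out`, `pool_out`, `vgg16_fc6_inputs`) are hypothetical helpers, not part of Caffe. The sketch assumes the standard VGG-16 layout: five blocks of 3x3 convolutions with padding 1 and stride 1, each block followed by 2x2 max pooling with stride 2, and 512 channels at pool5. It also assumes that Caffe rounds up when computing pooling output sizes, which makes no difference for inputs divisible by 32.

```python
import math


def conv_out(size, kernel, pad=0, stride=1):
    """Output size of a convolution along one spatial dimension (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, pad=0, stride=1):
    """Output size of a pooling layer along one dimension.

    Assumption: rounds up, as Caffe's pooling layer is understood to do.
    """
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


def vgg16_fc6_inputs(input_size, convs_per_block=(2, 2, 3, 3, 3), channels=512):
    """Return (pool5 spatial size, number of inputs seen by fc6).

    Assumes square inputs and the standard VGG-16 conv/pool settings.
    """
    size = input_size
    for n_convs in convs_per_block:
        for _ in range(n_convs):
            size = conv_out(size, kernel=3, pad=1, stride=1)  # size is preserved
        size = pool_out(size, kernel=2, stride=2)              # size is halved
    return size, channels * size * size


if __name__ == "__main__":
    for s in (224, 256):
        print(s, vgg16_fc6_inputs(s))
```

Under these assumptions a 224x224 input leaves a 7x7 map at pool5, so fc6 sees 512 x 7 x 7 = 25088 inputs, which matches the source shape in the error message. The target shape of 32768 = 512 x 8 x 8 corresponds to an 8x8 map at pool5, which is what a 256x256 input would produce. That suggests the data actually reaching the network is not 224x224.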

score 0 · Accepted Answer

This error occurs if you crop your images to 224 instead of 227, as was done with the original dataset. Adjust that and you should be good to go.
