
I am trying to retrain (read: fine-tune) a MobileNet image classifier.

The retraining script provided by TensorFlow (from the tutorial) only updates the weights of the newly added fully connected layer. I modified this script to update the weights of all layers of the pretrained model. I am using the MobileNet architecture with depth multiplier 0.25 and input size 128.
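Roughly, the change amounts to dropping the var_list restriction on the optimizer's train step. A minimal self-contained sketch of the difference (the variable scopes and loss here are illustrative stand-ins, not the tutorial's exact names):

import tensorflow as tf

# Stand-in variables and loss so the sketch runs on its own; in the real
# retrain script the loss is the cross-entropy of the new classification head.
with tf.variable_scope('MobilenetV1'):
    pretrained = tf.get_variable('weights', shape=[3],
                                 initializer=tf.zeros_initializer())
with tf.variable_scope('final_training_ops'):
    head = tf.get_variable('weights', shape=[3],
                           initializer=tf.zeros_initializer())
loss = tf.reduce_sum(tf.square(pretrained + head - 1.0))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

# Tutorial default: restrict updates to the newly added layer via var_list.
train_fc_only = optimizer.minimize(
    loss,
    var_list=tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                               scope='final_training_ops'))

# Modified script: omit var_list, so every trainable variable, including
# all pretrained MobileNet layers, gets updated.
train_all = optimizer.minimize(loss)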

However, while retraining I observed something strange: if I run batched inference on a particular image together with some other images, the activations after certain layers differ from the activations when the image is passed alone. Activations for the same image also differ across different batches. Example: for two batches batch_1: [img1, img2, img3] and batch_2: [img1, img4, img5], the activations of img1 differ between the two batches.

Here is the code I use for inference:

import numpy as np
import tensorflow as tf
from tensorflow.python.platform import gfile

graph = tf.get_default_graph()  # graph with the retrained MobileNet already loaded

with tf.Session(graph=graph) as sess:
    image_path = '/tmp/images/10dsf00003.jpg'
    id_ = gfile.FastGFile(image_path, 'rb').read()

    # The line below decodes the JPEG using tf.decode_jpeg and does some
    # preprocessing; jpeg_data_tensor and decoded_image_tensor come from
    # the retrain script.
    id = sess.run(decoded_image_tensor, {jpeg_data_tensor: id_})

    input_image_tensor = graph.get_tensor_by_name('input:0')

    layerXname = 'MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu:0'  # name of the layer whose activations to inspect
    layerX = graph.get_tensor_by_name(layerXname)
    layerXactivations = sess.run(layerX, {input_image_tensor: id})

The code above is executed once as-is, and once with the following change to the last line:

layerXactivations_batch = sess.run(layerX, {input_image_tensor: np.asarray([np.squeeze(id), np.squeeze(id), np.squeeze(id)])})

Here are some of the nodes in the graph:

[u'input',
 u'MobilenetV1/Conv2d_0/weights',
 u'MobilenetV1/Conv2d_0/weights/read',
 u'MobilenetV1/MobilenetV1/Conv2d_0/convolution',
 u'MobilenetV1/Conv2d_0/BatchNorm/beta',
 u'MobilenetV1/Conv2d_0/BatchNorm/beta/read',
 u'MobilenetV1/Conv2d_0/BatchNorm/gamma',
 u'MobilenetV1/Conv2d_0/BatchNorm/gamma/read',
 u'MobilenetV1/Conv2d_0/BatchNorm/moving_mean',
 u'MobilenetV1/Conv2d_0/BatchNorm/moving_mean/read',
 u'MobilenetV1/Conv2d_0/BatchNorm/moving_variance',
 u'MobilenetV1/Conv2d_0/BatchNorm/moving_variance/read',
 u'MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/batchnorm/add/y',
 u'MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/batchnorm/add',
 u'MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/batchnorm/Rsqrt',
 u'MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/batchnorm/mul',
 u'MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/batchnorm/mul_1',
 u'MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/batchnorm/mul_2',
 u'MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/batchnorm/sub',
 u'MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/batchnorm/add_1',
 u'MobilenetV1/MobilenetV1/Conv2d_0/Relu6',
 u'MobilenetV1/Conv2d_1_depthwise/depthwise_weights',
 u'MobilenetV1/Conv2d_1_depthwise/depthwise_weights/read',
 ...]

Now, when layerXname = 'MobilenetV1/MobilenetV1/Conv2d_0/convolution', the activations in the two cases described above are the same (i.e. layerXactivations and layerXactivations_batch[0] are identical). But after this layer, all layers have different activation values. I suspect that the BatchNorm operations that follow the 'MobilenetV1/MobilenetV1/Conv2d_0/convolution' layer behave differently for a batched input than for a single image. Or is the problem caused by something else?
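For what it's worth, here is a minimal standalone sketch (separate from my graph, and the layer names are not from MobileNet) that reproduces the batch norm behaviour I suspect, using tf.layers.batch_normalization in training mode:

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
# training=True makes batch norm normalize with the *current batch's*
# mean/variance, so each row's output depends on its batch-mates.
y = tf.layers.batch_normalization(x, training=True)

img1 = np.ones((1, 4), dtype=np.float32)
img2 = 5.0 * np.ones((1, 4), dtype=np.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    alone = sess.run(y, {x: img1})
    batched = sess.run(y, {x: np.concatenate([img1, img2], axis=0)})
    print(alone[0])    # img1 normalized against itself
    print(batched[0])  # img1 normalized against batch statistics: different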

Any help/pointers would be appreciated.


2 Answers


When you construct the MobileNet, there is a parameter called is_training. If you don't set it to False, the dropout and batch normalization layers will give you different results in different runs. With is_training=True, batch normalization normalizes each activation with the mean and variance of the current batch, so an image's activations depend on whatever else is in the batch. Batch normalization may change the values only a little, but dropout changes them a lot, because it drops some of the input values.

Take a look at the signature of mobilenet_v1:

def mobilenet_v1(inputs,
                 num_classes=1000,
                 dropout_keep_prob=0.999,
                 is_training=True,
                 min_depth=8,
                 depth_multiplier=1.0,
                 conv_defs=None,
                 prediction_fn=tf.contrib.layers.softmax,
                 spatial_squeeze=True,
                 reuse=None,
                 scope='MobilenetV1'):
  """Mobilenet v1 model for classification.

  Args:
    inputs: a tensor of shape [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    dropout_keep_prob: the percentage of activation values that are retained.
    is_training: whether is training or not.
    min_depth: Minimum depth value (number of channels) for all convolution ops.
      Enforced when depth_multiplier < 1, and not an active constraint when
      depth_multiplier >= 1.
    depth_multiplier: Float multiplier for the depth (number of channels)
      for all convolution ops. The value must be greater than zero. Typical
      usage will be to set this value in (0, 1) to reduce the number of
      parameters or computation cost of the model.
    conv_defs: A list of ConvDef namedtuples specifying the net architecture.
    prediction_fn: a function to get predictions out of logits.
    spatial_squeeze: if True, logits is of shape is [B, C], if false logits is
        of shape [B, 1, 1, C], where B is batch_size and C is number of classes.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse 'scope' must be given.
    scope: Optional variable_scope.

  Returns:
    logits: the pre-softmax activations, a tensor of size
      [batch_size, num_classes]
    end_points: a dictionary from components of the network to the corresponding
      activation.

  Raises:
    ValueError: Input rank is invalid.
  """
Answered 2017-09-24T14:33:20.347

This is due to batch normalization.

How are you running inference? Are you loading the model from a checkpoint file, or are you using a frozen protobuf model? If you use a frozen model, you can expect similar results for different input formats (batched vs. single image).
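If you are currently restoring from a checkpoint, one way to try this is to bake the variables into constants and run inference on the frozen graph instead. A rough sketch, assuming the MobileNet graph is already built in the default graph (the checkpoint path and output node name below are illustrative):

import tensorflow as tf
from tensorflow.python.framework import graph_util

with tf.Session(graph=tf.get_default_graph()) as sess:
    saver = tf.train.Saver()
    saver.restore(sess, '/tmp/mobilenet/model.ckpt')

    # Converts variables, including batch norm's moving_mean and
    # moving_variance, into graph constants.
    frozen_graph_def = graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(),
        ['MobilenetV1/Predictions/Reshape_1'])  # illustrative output node

    with tf.gfile.GFile('/tmp/mobilenet/frozen.pb', 'wb') as f:
        f.write(frozen_graph_def.SerializeToString())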

Take a look at this; a similar question for a different application was asked there.

Answered 2017-09-26T11:34:59.620