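I'm really new to machine learning. I'm currently using the TensorFlow Object Detection API to perform object detection, and the model I'm using is faster_rcnn_resnet101.

What I'm looking for is the Python code that defines the architecture, for example the number of layers (like the code I've attached below, which comes from a TensorFlow tutorial: https://cv-tricks.com/tensorflow-tutorial/training-convolutional-neural-network-for-image-classification/). TensorFlow is not like YOLO, where I can easily find the place where the architecture is defined...

Thanks a lot for your help! I'd like to know where I can find the file that defines the architecture of faster_rcnn_resnet101.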

import tensorflow as tf  # TensorFlow 1.x API


def create_convolutional_layer(input,
                               num_input_channels,
                               conv_filter_size,
                               num_filters):
    # create_weights and create_biases are helper functions from the linked tutorial.
    # Define the weights that will be trained.
    weights = create_weights(shape=[conv_filter_size, conv_filter_size,
                                    num_input_channels, num_filters])
    # Create the biases; these are also trained.
    biases = create_biases(num_filters)

    # Create the convolutional layer.
    layer = tf.nn.conv2d(input=input,
                         filter=weights,
                         strides=[1, 1, 1, 1],
                         padding='SAME')

    layer += biases

    # Apply max pooling.
    layer = tf.nn.max_pool(value=layer,
                           ksize=[1, 2, 2, 1],
                           strides=[1, 2, 2, 1],
                           padding='SAME')
    # The pooled output is fed to ReLU, the activation function.
    layer = tf.nn.relu(layer)

    return layer

2 Answers

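The TensorFlow Object Detection API uses feature extraction: it reuses the representations learned by a previously trained network to extract meaningful features from new samples.

The Faster_RCNN_ResNet_101 feature extractor is defined in this class: https://github.com/tensorflow/models/blob/master/research/object_detection/models/faster_rcnn_resnet_v1_feature_extractor.py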

class FasterRCNNResnet101FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
  """Faster R-CNN Resnet 101 feature extractor implementation."""

  def __init__(self,
               is_training,
               first_stage_features_stride,
               batch_norm_trainable=False,
               reuse_weights=None,
               weight_decay=0.0):
    """Constructor.
    Args:
      is_training: See base class.
      first_stage_features_stride: See base class.
      batch_norm_trainable: See base class.
      reuse_weights: See base class.
      weight_decay: See base class.
    Raises:
      ValueError: If `first_stage_features_stride` is not 8 or 16,
        or if `architecture` is not supported.
    """
    super(FasterRCNNResnet101FeatureExtractor, self).__init__(
        'resnet_v1_101', resnet_v1.resnet_v1_101, is_training,
        first_stage_features_stride, batch_norm_trainable,
        reuse_weights, weight_decay)
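
The constructor above does little more than pin two arguments for the base class: the architecture name and the function that builds the network. Here is a framework-free sketch of that delegation pattern. `BaseExtractor`, `Resnet101Extractor`, `fake_resnet_v1_101`, and `EXTRACTOR_REGISTRY` are hypothetical names made up for illustration; they are not part of the Object Detection API, and the registry only assumes that a config type string gets mapped to a class somewhere.

```python
# Illustrative stand-ins only; none of these names exist in the Object Detection API.

def fake_resnet_v1_101(inputs):
    """Hypothetical placeholder for the real network-building function."""
    return "resnet_v1_101 features of " + inputs


class BaseExtractor:
    """Plays the role of the shared ResNet v1 base class."""

    def __init__(self, architecture, resnet_model, first_stage_features_stride):
        # The docstring quoted above says only strides 8 and 16 are accepted.
        if first_stage_features_stride not in (8, 16):
            raise ValueError("`first_stage_features_stride` must be 8 or 16.")
        self.architecture = architecture
        self.resnet_model = resnet_model
        self.first_stage_features_stride = first_stage_features_stride

    def extract(self, inputs):
        # The base class owns the logic; the subclass only chose the network.
        return self.resnet_model(inputs)


class Resnet101Extractor(BaseExtractor):
    """Fixes the architecture name and builder, like the ResNet-101 wrapper."""

    def __init__(self, first_stage_features_stride):
        super().__init__("resnet_v1_101", fake_resnet_v1_101,
                         first_stage_features_stride)


# Hypothetical lookup from a config type string to a wrapper class.
EXTRACTOR_REGISTRY = {"faster_rcnn_resnet101": Resnet101Extractor}

extractor = EXTRACTOR_REGISTRY["faster_rcnn_resnet101"](
    first_stage_features_stride=16)
print(extractor.architecture)
print(extractor.extract("image"))
```

The point of the sketch is that the layers themselves are not defined in the wrapper; the wrapper only selects which network function to call.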

As you can see from the line from object_detection.meta_architectures import faster_rcnn_meta_arch at the top of the full code, the general TensorFlow implementation of the Faster R-CNN detection model is probably defined in https://github.com/tensorflow/models/blob/master/research/object_detection/meta_architectures/faster_rcnn_meta_arch.py

answered 2019-04-15T00:24:10.640

The Object Detection API uses tf-slim to build its models. tf-slim is a TensorFlow API that contains many predefined CNNs and provides building blocks for CNNs. In the Object Detection API, the CNNs are called feature extractors; wrapper classes around these feature extractors provide a uniform interface across different model architectures.

For example, the faster_rcnn_resnet101 model uses ResNet-101 as its feature extractor, so there is a corresponding FasterRCNNResnetV1FeatureExtractor wrapper class in the file faster_rcnn_resnet_v1_feature_extractor.py under the models directory.

import tensorflow as tf
from nets import resnet_utils
from nets import resnet_v1
slim = tf.contrib.slim  # tf.contrib exists only in TensorFlow 1.x

In this class you will find that slim is used to build the feature extractors. nets is a module from slim that contains many predefined CNNs. So the code that defines your model (its layers) can be found in the nets module; here is an excerpt from the resnet_v1 module:

def resnet_v1_block(scope, base_depth, num_units, stride):
  """Helper function for creating a resnet_v1 bottleneck block.
  Args:
    scope: The scope of the block.
    base_depth: The depth of the bottleneck layer for each unit.
    num_units: The number of units in the block.
    stride: The stride of the block, implemented as a stride in the last unit.
      All other units have stride=1.
  Returns:
    A resnet_v1 bottleneck block.
  """
  return resnet_utils.Block(scope, bottleneck, [{
      'depth': base_depth * 4,
      'depth_bottleneck': base_depth,
      'stride': 1
  }] * (num_units - 1) + [{
      'depth': base_depth * 4,
      'depth_bottleneck': base_depth,
      'stride': stride
  }])


def resnet_v1_50(inputs,
                 num_classes=None,
                 is_training=True,
                 global_pool=True,
                 output_stride=None,
                 spatial_squeeze=True,
                 store_non_strided_activations=False,
                 min_base_depth=8,
                 depth_multiplier=1,
                 reuse=None,
                 scope='resnet_v1_50'):
  """ResNet-50 model of [1]. See resnet_v1() for arg and return description."""
  depth_func = lambda d: max(int(d * depth_multiplier), min_base_depth)
  blocks = [
      resnet_v1_block('block1', base_depth=depth_func(64), num_units=3,
                      stride=2),
      resnet_v1_block('block2', base_depth=depth_func(128), num_units=4,
                      stride=2),
      resnet_v1_block('block3', base_depth=depth_func(256), num_units=6,
                      stride=2),
      resnet_v1_block('block4', base_depth=depth_func(512), num_units=3,
                      stride=1),
  ]
  return resnet_v1(inputs, blocks, num_classes, is_training,
                   global_pool=global_pool, output_stride=output_stride,
                   include_root_block=True, spatial_squeeze=spatial_squeeze,
                   store_non_strided_activations=store_non_strided_activations,
                   reuse=reuse, scope=scope)

The code above shows how a ResNet-50 model is built (I chose ResNet-50 because the concept is the same as for ResNet-101, just with fewer layers). Note that ResNet-50 has 4 blocks containing [3, 4, 6, 3] units respectively. Here is a diagram of ResNet-50 in which you can see the 4 blocks.

[Image: ResNet-50 architecture diagram]
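
To connect the unit counts to the names ResNet-50 and ResNet-101, here is a small plain-Python sketch with no TensorFlow dependency. It assumes the usual counting convention: three convolutions per bottleneck unit, plus the root convolution and the final classification layer, with shortcut projections not counted. `unit_args` and `count_weighted_layers` are hypothetical helpers written for this example; `unit_args` only mirrors the list expression in `resnet_v1_block` with plain dicts.

```python
# Units per block for the two variants (ResNet-101 differs only in block3).
UNITS_PER_BLOCK = {
    "resnet_v1_50": [3, 4, 6, 3],
    "resnet_v1_101": [3, 4, 23, 3],
}


def unit_args(base_depth, num_units, stride):
    """Mirror the list built in resnet_v1_block, using plain dicts."""
    unit = {"depth": base_depth * 4, "depth_bottleneck": base_depth}
    # All units use stride 1 except the last one, which carries the block stride.
    return [dict(unit, stride=1)] * (num_units - 1) + [dict(unit, stride=stride)]


def count_weighted_layers(units):
    """3 convolutions per bottleneck unit + root convolution + final layer."""
    return 3 * sum(units) + 2


for name, units in UNITS_PER_BLOCK.items():
    print(name, count_weighted_layers(units))

# block3 of ResNet-101: 23 units, only the last one has stride 2.
block3 = unit_args(base_depth=256, num_units=23, stride=2)
print(len(block3), block3[0]["stride"], block3[-1]["stride"], block3[-1]["depth"])
```

Under this convention the two unit lists give 50 and 101 layers, which is where the model names come from.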

That covers the ResNet part. The features extracted by the first-stage feature extractor (ResNet-101) are fed to the proposal generator, which generates regions; these regions, together with the features, are then fed into the box classifier for class prediction and bounding-box regression.
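
As a rough illustration of what the proposal generator receives, this sketch estimates the shape of the first-stage feature map for a given image size. It rests on assumptions not stated in this thread: the first stage ends at block3, whose output depth for ResNet-101 is 1024, and the spatial size is the image size divided by the feature stride, rounded up. `first_stage_feature_shape` is a hypothetical helper, not an API function.

```python
import math


def first_stage_feature_shape(image_height, image_width,
                              features_stride=16, depth=1024):
    """Approximate (height, width, depth) of the first-stage feature map.

    Assumes the spatial size is ceil(image_size / stride) and that the
    features come from block3 of ResNet-101 (depth 1024).
    """
    if features_stride not in (8, 16):
        raise ValueError("features_stride must be 8 or 16")
    return (math.ceil(image_height / features_stride),
            math.ceil(image_width / features_stride),
            depth)


print(first_stage_feature_shape(600, 1024))
print(first_stage_feature_shape(600, 1024, features_stride=8))
```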

The Faster R-CNN part is specified as a meta-architecture. Meta-architectures are a recipe for converting classification architectures into detection architectures, in this case from ResNet-101 to Faster R-CNN. Here is a diagram of the Faster R-CNN meta-architecture.

[Image: Faster R-CNN meta-architecture diagram]

In the box classifier part there are also pooling operations (on the cropped regions) and convolutional operations (to extract features from the cropped regions). In the faster_rcnn_meta_arch class, a max-pool operation is applied to the cropped regions, and the subsequent convolutions are performed in the feature extractor class again, this time for the second stage. In the method below you can clearly see another block (block4) being used.

def _extract_box_classifier_features(self, proposal_feature_maps, scope):
    """Extracts second stage box classifier features.
    Args:
      proposal_feature_maps: A 4-D float tensor with shape
        [batch_size * self.max_num_proposals, crop_height, crop_width, depth]
        representing the feature map cropped to each proposal.
      scope: A scope name (unused).
    Returns:
      proposal_classifier_features: A 4-D float tensor with shape
        [batch_size * self.max_num_proposals, height, width, depth]
        representing box classifier features for each proposal.
    """
    with tf.variable_scope(self._architecture, reuse=self._reuse_weights):
      with slim.arg_scope(
          resnet_utils.resnet_arg_scope(
              batch_norm_epsilon=1e-5,
              batch_norm_scale=True,
              weight_decay=self._weight_decay)):
        with slim.arg_scope([slim.batch_norm],
                            is_training=self._train_batch_norm):
          blocks = [
              resnet_utils.Block('block4', resnet_v1.bottleneck, [{
                  'depth': 2048,
                  'depth_bottleneck': 512,
                  'stride': 1
              }] * 3)
          ]
          proposal_classifier_features = resnet_utils.stack_blocks_dense(
              proposal_feature_maps, blocks)
    return proposal_classifier_features
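
To make the tensor shapes in the docstring above concrete, here is a plain-Python shape trace of the second stage. The default values are assumptions for illustration only: a crop size of 14, a 2x2 max pool with stride 2 and no padding, and 300 proposals are settings you would normally read from your own pipeline config. `second_stage_shapes` is a hypothetical helper. The block4 depth of 2048 and stride of 1 come from the code above.

```python
def second_stage_shapes(num_proposals, initial_crop_size=14,
                        maxpool_kernel_size=2, maxpool_stride=2,
                        input_depth=1024, block4_depth=2048, block4_stride=1):
    """Trace tensor shapes from cropped proposals to box classifier features."""
    cropped = (num_proposals, initial_crop_size, initial_crop_size, input_depth)

    # Max pooling without padding: floor((size - kernel) / stride) + 1.
    pooled_size = (initial_crop_size - maxpool_kernel_size) // maxpool_stride + 1
    pooled = (num_proposals, pooled_size, pooled_size, input_depth)

    # block4 with stride 1 keeps the spatial size (ceil division by the stride)
    # and changes the depth to 2048.
    out_size = -(-pooled_size // block4_stride)
    features = (num_proposals, out_size, out_size, block4_depth)
    return cropped, pooled, features


cropped, pooled, features = second_stage_shapes(num_proposals=300)
print(cropped)
print(pooled)
print(features)
```

With these assumed settings, each proposal ends up as a 7x7x2048 feature map before class prediction and box regression.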
answered 2019-04-15T09:32:17.447