python - TensorFlow Object Detection API: How to create tfrecords with images that contain no labels (hard negatives)?

Question

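Hello.

I am currently using the TensorFlow Object Detection API (with Faster R-CNN) on my own dataset. For some of my labels, I have identified objects that are very likely to be detected as false positives. I know the API uses hard example mining, so I am trying to introduce images containing these hard objects into training so that the miner can pick them up as hard negatives.

Following this conversation on GitHub, https://github.com/tensorflow/models/issues/2544, I was told that this is possible:

You can have purely negative images, and the faster_rcnn model will sample from their anchors.

So my question is: how do I create tfrecords for images that have no bounding boxes at all? What do I put in the associated .xml file?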

score 1 · Accepted Answer

In your tfrecords generation script, make sure you add the metadata of the hard negative images to the TFRecord, like this:

tf_example = tf.train.Example(features=tf.train.Features(feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(filename),
            'image/source_id': dataset_util.bytes_feature(filename),
            'image/encoded': dataset_util.bytes_feature(encoded_jpg),
            'image/format': dataset_util.bytes_feature(image_format)
            }))

For images that contain objects, you must also add the bounding box and label information:

tf_example = tf.train.Example(features=tf.train.Features(feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(filename),
            'image/source_id': dataset_util.bytes_feature(filename),
            'image/encoded': dataset_util.bytes_feature(encoded_jpg),
            'image/format': dataset_util.bytes_feature(image_format),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
            'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
            'image/object/class/label': dataset_util.int64_list_feature(classes),
        }))

Also make sure min_negatives_per_image is set to a positive number in the pipeline.config file; otherwise the negative images will not be used for training.
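For orientation, min_negatives_per_image is a field of the hard_example_miner block in the pipeline config. The fragment below is only a sketch: the numeric values are illustrative assumptions, and where the block nests depends on the model type, so check it against the sample config for your model.

```
hard_example_miner {
  # All values here are illustrative, not recommendations.
  num_hard_examples: 3000
  iou_threshold: 0.99
  loss_type: CLASSIFICATION
  max_negatives_per_positive: 3
  min_negatives_per_image: 3  # must be > 0 for images without boxes to contribute
}
```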

score 0 · Accepted Answer

I adjusted my dataset by adding a dummy annotation to each image that has no real annotations, and changed the tfrecord generator code to:

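import io
import os

import pandas as pd
import tensorflow as tf
from PIL import Image
from object_detection.utils import dataset_util


def create_tf_example(group, path, label_map):
    # tf.gfile is the TF 1.x API; in TF 2.x the equivalent is tf.io.gfile.GFile.
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid: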
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'

    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

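    for index, row in group.object.iterrows():
        # Skip rows with missing coordinates and the dummy annotation
        # (xmin == -1), so hard-negative images keep empty box lists.
        if not pd.isnull(row.xmin):
            if not row.xmin == -1: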
                xmins.append(row['xmin'] / width)
                xmaxs.append(row['xmax'] / width)
                ymins.append(row['ymin'] / height)
                ymaxs.append(row['ymax'] / height)
                classes_text.append(row['class'].encode('utf8'))
                classes.append(label_map[row['class']])

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example

So whenever the dummy annotation (xmin == -1) appears, the script creates a tfrecord entry with empty bounding-box lists (classes, xmin, xmax, ymin, ymax).
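The filtering step can be looked at in isolation, without pandas or TensorFlow. The following is a minimal sketch, assuming each annotation row is a plain dict and that -1 in xmin is the dummy marker; the names collect_boxes and DUMMY_XMIN are hypothetical and not part of the Object Detection API.

```python
import math

DUMMY_XMIN = -1  # assumed sentinel meaning "this image has no real annotation"


def collect_boxes(rows, width, height, label_map):
    """Return normalized box lists, skipping missing or dummy rows."""
    result = {
        "xmins": [], "xmaxs": [], "ymins": [], "ymaxs": [],
        "classes_text": [], "classes": [],
    }
    for row in rows:
        xmin = row.get("xmin")
        # Skip rows without a coordinate (None or NaN).
        if xmin is None or (isinstance(xmin, float) and math.isnan(xmin)):
            continue
        # Skip the dummy annotation that marks a hard-negative image.
        if xmin == DUMMY_XMIN:
            continue
        # Normalize pixel coordinates to the [0, 1] range.
        result["xmins"].append(row["xmin"] / width)
        result["xmaxs"].append(row["xmax"] / width)
        result["ymins"].append(row["ymin"] / height)
        result["ymaxs"].append(row["ymax"] / height)
        result["classes_text"].append(row["class"].encode("utf8"))
        result["classes"].append(label_map[row["class"]])
    return result
```

For an image whose only row is the dummy annotation, every list in the result stays empty, which is the state the tfrecord needs for a hard negative.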

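Apart from some messy training-loss behavior, my model successfully learned the negative sample patterns, which reduced the false positives I was getting to zero in my scenario.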

score 0 · Accepted Answer

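You don't have to do anything special for this. Whatever the source format is, just leave the relevant bounding-box lists empty. I did this in my experiments but did not see any gain.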
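As an illustration of leaving the lists empty, here is a minimal sketch that assumes the source annotations are Pascal VOC-style XML files (an assumption; the thread only mentions ".xml"). A negative image is then simply an annotation with no object elements. The file name, image size, and the helper name boxes_from_voc_xml are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for a hard-negative image: no <object> elements.
NEGATIVE_XML = """<annotation>
  <filename>negative_0001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
</annotation>"""


def boxes_from_voc_xml(xml_text):
    """Parse VOC-style XML into normalized box lists (empty if no objects)."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)

    xmins, xmaxs, ymins, ymaxs, names = [], [], [], [], []
    # findall returns an empty list when there are no <object> elements,
    # so the loop body never runs for a negative image.
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        xmins.append(float(box.find("xmin").text) / width)
        xmaxs.append(float(box.find("xmax").text) / width)
        ymins.append(float(box.find("ymin").text) / height)
        ymaxs.append(float(box.find("ymax").text) / height)
        names.append(obj.find("name").text)
    return xmins, xmaxs, ymins, ymaxs, names
```

The returned lists can be passed unchanged to the float_list_feature and bytes_list_feature calls shown in the answers above (after encoding the names to bytes).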
