tensorflow - Tensorflow Object-API：将 ssd 模型转换为 tflite 并在 python 中使用

Question

我很难将给定的 tensorflow 模型转换为 tflite 模型然后使用它。我已经发布了一个问题，我在其中描述了我的问题，但没有分享我正在使用的模型，因为我不允许这样做。由于我没有以这种方式找到答案，因此我尝试转换公共模型（ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu）。

这是来自对象检测 api的 colab 教程。我只是运行整个脚本而不做任何更改（它的模型相同）并下载了生成的模型（有和没有元数据）。我将它们与 coco17 火车数据集中的示例图片一起上传到这里。

我尝试直接在 python 中使用这些模型，但结果感觉像垃圾。

这是我使用的代码，我遵循了本指南。我更改了 rects、scores 和 classes 的索引，因为否则结果的格式不正确。

#interpreter = tf.lite.Interpreter("original_models/model.tflite")
interpreter = tf.lite.Interpreter("original_models/model_with_metadata.tflite")

interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

size = 640

def draw_rect(image, box):
    y_min = int(max(1, (box[0] * size)))
    x_min = int(max(1, (box[1] * size)))
    y_max = int(min(size, (box[2] * size)))
    x_max = int(min(size, (box[3] * size)))
    
    # draw a rectangle on the image
    cv2.rectangle(image, (x_min, y_min), (x_max, y_max), (255, 255, 255), 2)

file = "images/000000000034.jpg"


img = cv2.imread(file)
new_img = cv2.resize(img, (size, size))
new_img = cv2.cvtColor(new_img, cv2.COLOR_BGR2RGB)

interpreter.set_tensor(input_details[0]['index'], [new_img.astype("f")])

interpreter.invoke()
rects = interpreter.get_tensor(
    output_details[1]['index'])

scores = interpreter.get_tensor(
    output_details[0]['index'])

classes = interpreter.get_tensor(
    output_details[3]['index'])


for index, score in enumerate(scores[0]):
        draw_rect(new_img,rects[0][index])
        #print(rects[0][index])
        print("scores: ",scores[0][index])
        print("class id: ", classes[0][index])
        print("______________________________")


cv2.imshow("image", new_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

这导致以下控制台输出

scores:  0.20041436
class id:  51.0
______________________________
scores:  0.08925027
class id:  34.0
______________________________
scores:  0.079722285
class id:  34.0
______________________________
scores:  0.06676647
class id:  71.0
______________________________
scores:  0.06626186
class id:  15.0
______________________________
scores:  0.059938848
class id:  86.0
______________________________
scores:  0.058229476
class id:  34.0
______________________________
scores:  0.053791136
class id:  37.0
______________________________
scores:  0.053478718
class id:  15.0
______________________________
scores:  0.052847564
class id:  43.0
______________________________

和生成的图像

.

我尝试了来自 orinal 训练数据集的不同图像，但从未得到好的结果。我认为输出层坏了，或者可能缺少一些后处理？

我还尝试使用官方 tensorflow 文档中给出的转换方法。

import tensorflow as tf

saved_model_dir = 'tf_models/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8/saved_model/'
    # Convert the model
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir) # path to the SavedModel directory
tflite_model = converter.convert()
    
# Save the model.
with open('model.tflite', 'wb') as f:
      f.write(tflite_model)

但是当我尝试使用该模型时，我得到了一个ValueError: Cannot set tensor: Dimension mismatch. Got 640 but expected 1 for dimension 1 of input 0.

有谁知道我做错了什么？

更新：在 Farmmakers 建议之后，我尝试更改最后由短脚本生成的模型的输入尺寸。之前的形状是：

[{'name': 'serving_default_input_tensor:0',
  'index': 0,
  'shape': array([1, 1, 1, 3], dtype=int32),
  'shape_signature': array([ 1, -1, -1,  3], dtype=int32),
  'dtype': numpy.uint8,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}}]

所以增加一维是不够的。因此我使用了interpreter.resize_tensor_input(0, [1,640,640,3]). 现在它可以通过网络提供图像。

不幸的是，我无法理解输出。这是输出详细信息的打印：

[{'name': 'StatefulPartitionedCall:6',
  'index': 473,
  'shape': array([    1, 51150,     4], dtype=int32),
  'shape_signature': array([    1, 51150,     4], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:0',
  'index': 2233,
  'shape': array([1, 1], dtype=int32),
  'shape_signature': array([ 1, -1], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:5',
  'index': 2198,
  'shape': array([1], dtype=int32),
  'shape_signature': array([1], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:7',
  'index': 493,
  'shape': array([    1, 51150,    91], dtype=int32),
  'shape_signature': array([    1, 51150,    91], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:1',
  'index': 2286,
  'shape': array([1, 1, 1], dtype=int32),
  'shape_signature': array([ 1, -1, -1], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:2',
  'index': 2268,
  'shape': array([1, 1], dtype=int32),
  'shape_signature': array([ 1, -1], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:4',
  'index': 2215,
  'shape': array([1, 1], dtype=int32),
  'shape_signature': array([ 1, -1], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:3',
  'index': 2251,
  'shape': array([1, 1, 1], dtype=int32),
  'shape_signature': array([ 1, -1, -1], dtype=int32),
  'dtype': numpy.float32,
  'quantization': (0.0, 0),
  'quantization_parameters': {'scales': array([], dtype=float32),
   'zero_points': array([], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}}]

我将这样生成的 tflite 模型添加到了google drive中。

Update2：我在google 驱动器中添加了一个目录，其中包含一个使用全尺寸模型并产生正确输出的笔记本。如果您执行整个笔记本，它应该会在您的磁盘上生成以下图像。

score 1 · Accepted Answer

要使来自对象检测 API 的模型与 TFLite 良好配合，您必须将其转换为具有自定义操作的 TFLite 友好图。

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_mobile_tf2.md

（TF1 文档）

您也可以尝试使用TensorFlow Lite Model Maker

score 1 · Accepted Answer

我遵循了您显示的确切程序（tensorflow doc中提到的标准程序）。

首先，tflite 模型返回的输出，与官方文档中描述的不同，具有不同的格式（不同的索引）。

  boxes = get_output_tensor(interpreter, 1)
  classes = get_output_tensor(interpreter, 3)
  scores = get_output_tensor(interpreter, 0)
  count = int(get_output_tensor(interpreter, 2))

其次，重新调整的边界框的数量始终为 10，我不知道如何将其更改为数据集中的自定义对象数量。

最后，我解决它的方法是使用索引 1 检索边界框，并使用分数将它们过滤掉。但是，我得到的结果与原始模型相差甚远。此外，tflite 模型比原始模型花费更多时间，这与 tflite 的用途相反。可能是因为我在笔记本电脑上运行它，因此是 x86 指令集（tflite 已优化为在 ARM CPU 上运行（移动设备，树莓派））。

tensorflow - Tensorflow Object-API：将 ssd 模型转换为 tflite 并在 python 中使用

2 回答 2

Related

Reference