
We are trying to run a semantic segmentation model on Android using DeepLabv3 and MobileNetV2. We followed the official TensorFlow Lite conversion procedure using TOCO and tflite_convert with the help of bazel. The source frozen graph was obtained from the official TensorFlow DeepLab Model Zoo.

We were able to successfully convert the model with the following command:

CUDA_VISIBLE_DEVICES="0" toco \
  --output_file=toco256.tflite \
  --graph_def_file=path/to/deeplab/deeplabv3_mnv2_pascal_trainval/frozen_inference_graph.pb \
  --input_arrays=ImageTensor \
  --output_arrays=SemanticPredictions \
  --input_shapes=1,256,256,3 \
  --inference_input_type=QUANTIZED_UINT8 \
  --inference_type=FLOAT \
  --mean_values=128 \
  --std_dev_values=127 \
  --allow_custom_ops \
  --post_training_quantize
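
For reference, a rough Python-API equivalent of this command would be something like the sketch below (tf.lite.TFLiteConverter requires TF >= 1.12; on TF 1.11 the class lives under tf.contrib.lite instead; paths are illustrative):

import tensorflow as tf

# Sketch of the same conversion via the Python API (TF >= 1.12).
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="path/to/frozen_inference_graph.pb",
    input_arrays=["ImageTensor"],
    output_arrays=["SemanticPredictions"],
    input_shapes={"ImageTensor": [1, 256, 256, 3]})
converter.inference_type = tf.float32           # --inference_type=FLOAT
converter.inference_input_type = tf.uint8       # --inference_input_type=QUANTIZED_UINT8
converter.quantized_input_stats = {"ImageTensor": (128., 127.)}  # (mean, std_dev)
converter.allow_custom_ops = True
converter.post_training_quantize = True

with open("toco256.tflite", "wb") as f:
    f.write(converter.convert())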

The size of the tflite file is around 2.25 MB. But when we try to test the model using the official benchmark tool, it fails with the following error report:

bazel run -c opt tensorflow/contrib/lite/tools/benchmark:benchmark_model -- --graph=`realpath toco256.tflite`
INFO: Analysed target //tensorflow/contrib/lite/tools/benchmark:benchmark_model (0 packages loaded).
INFO: Found 1 target...
Target //tensorflow/contrib/lite/tools/benchmark:benchmark_model up-to-date:
  bazel-bin/tensorflow/contrib/lite/tools/benchmark/benchmark_model
INFO: Elapsed time: 0.154s, Critical Path: 0.00s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/tensorflow/contrib/lite/tools/benchmark/benchmark_model '--graph=path/to/deeplab/venv/tensorflow/toco256.tflite'
STARTING!
Num runs: [50]
Inter-run delay (seconds): [-1]
Num threads: [1]
Benchmark name: []
Output prefix: []
Warmup runs: [1]
Graph: [path/to/venv/tensorflow/toco256.tflite]
Input layers: []
Input shapes: []
Use nnapi : [0]
Loaded model path/to/venv/tensorflow/toco256.tflite
resolved reporter
Initialized session in 45.556ms
Running benchmark for 1 iterations 
tensorflow/contrib/lite/kernels/pad.cc:96 op_context.dims != 4 (3 != 4)
Node number 24 (PAD) failed to prepare.

Failed to invoke!
Aborted (core dumped)

We also tried the same command without the 'allow_custom_ops' and 'post_training_quantize' options, and even used the same input size as the model (1,513,513,3); but the result was the same.

This problem seems similar to the following github issue: https://github.com/tensorflow/tensorflow/issues/21266. However, that issue is supposed to have been fixed in the latest version of TensorFlow.

Model: http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_trainval_2018_01_29.tar.gz
TensorFlow version: 1.11
Bazel version: 0.17.2
OS: Ubuntu 18.04

Also, the Android application is not able to load the model properly (tflite interpreter).

So, how can we correctly convert a segmentation model into a tflite format that can be used for inference on an Android device?

UPDATE:

Using tensorflow 1.12, we got a new error:

$ bazel run -c opt tensorflow/lite/tools/benchmark:benchmark_model -- --graph=`realpath /path/to/research/deeplab/venv/tensorflow/toco256.tflite`

    tensorflow/lite/kernels/depthwise_conv.cc:99 params->depth_multiplier * SizeOfDimension(input, 3) != SizeOfDimension(filter, 3) (0 != 32)
    Node number 30 (DEPTHWISE_CONV_2D) failed to prepare.

Also, while using a newer version of the same model (a 3 MB .pb file) with depth_multiplier=0.5 from the tensorflow deeplab model zoo, we got a different error:

F tensorflow/lite/toco/graph_transformations/propagate_fixed_sizes.cc:116] Check failed: dim_x == dim_y (3 vs. 32)Dimensions must match

In this case, we used the same command as above for the tflite conversion, but we could not even produce a 'tflite' file as output. It seems to be an issue with the depth multiplier value (we even tried passing the depth_multiplier parameter as an argument at conversion time).


2 Answers


I also faced this problem. There seem to be two issues in the conversion (both can be confirmed by inspecting the frozen graph, as in the sketch after this list):

  • the input tensor has a dynamic shape, i.e. [?,?,?,3]
  • the pad_to_bounding_box node section does not get automatically converted to a static shape
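
A minimal sketch to confirm both points, assuming your frozen graph is named deeplab_mobilenet_v2.pb (the path is illustrative):

import tensorflow as tf

# Load the frozen graph and inspect the input shape plus the padding scope.
graph_def = tf.GraphDef()
with tf.gfile.GFile("deeplab_mobilenet_v2.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")
    # Dynamic input shape, e.g. (?, ?, ?, 3)
    print(graph.get_tensor_by_name("ImageTensor:0").shape)
    # Nodes in the pad_to_bounding_box scope that keep shapes dynamic
    for op in graph.get_operations():
        if "pad_to_bounding_box" in op.name:
            print(op.name, op.type)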

The solution below was tested in the following environment:

  • tensorflow 1.15
  • Ubuntu 16.04

Solution

I am assuming you have already created a .pb file using the export_model.py file in the deeplab folder and named this file deeplab_mobilenet_v2.pb. Starting from here:

Step 1: Optimize for inference

python3 optimize_for_inference.py \
        --input "path/to/your/deeplab_mobilenet_v2.pb" \
        --output "path/to/deeplab_mobilenet_v2_opt.pb" \
        --frozen_graph True \
        --input_names ImageTensor \
        --output_names SemanticPredictions \
        --placeholder_type_enum=4

placeholder_type_enum=4 is the uint8 data type (dtypes.uint8.as_datatype_enum).
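
If you need the enum value for a different input dtype, you can look it up in Python; a minimal check, assuming TensorFlow 1.x:

from tensorflow.python.framework import dtypes

# DataType enum value used by --placeholder_type_enum; prints 4 for uint8.
print(dtypes.uint8.as_datatype_enum)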

Step 2: Apply the Graph Transform Tool

Make sure you have bazel installed and have downloaded the tensorflow r1.15 branch from github. Then build the transform_graph tool from the tensorflow repo:

bazel build tensorflow/tools/graph_transforms:transform_graph

Then run the transform_graph tool (make sure to set the shape to whatever you are using as input):

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph="/path/to/deeplab_mobilenet_v2_opt.pb" \
--out_graph="/path/to/deeplab_mobilenet_v2_opt_flatten.pb" \
--inputs='ImageTensor' \
--outputs='SemanticPredictions' \
--transforms='
    strip_unused_nodes(type=quint8, shape="1,400,225,3")
    flatten_atrous_conv
    fold_constants(ignore_errors=true, clear_output_shapes=false)
    fold_batch_norms
    fold_old_batch_norms
    remove_device
    sort_by_execution_order'

Step 3: Bypass the pad_to_bounding_box nodes and make the input static

Run the python file below, making sure to change model_filepath, save_folder and save_name to whatever suits your needs.

import tensorflow as tf
import numpy as np
from tensorflow.contrib import graph_editor as ge

def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """
    Freezes the state of a session into a pruned computation graph.

    Creates a new computation graph where variable nodes are replaced by
    constants taking their current value in the session. The new graph will be
    pruned so subgraphs that are not necessary to compute the requested
    outputs are removed.
    @param session The TensorFlow session to be frozen.
    @param keep_var_names A list of variable names that should not be frozen,
                          or None to freeze all the variables in the graph.
    @param output_names Names of the relevant graph outputs.
    @param clear_devices Remove the device directives from the graph for better portability.
    @return The frozen graph definition.
    """
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = tf.graph_util.convert_variables_to_constants(
            session, input_graph_def, output_names, freeze_var_names)
        return frozen_graph

def load_convert_save_graph(model_filepath, save_folder, save_name):
    '''
    Load the trained model.
    '''
    print('Loading model...')
    graph = tf.Graph()
    sess = tf.InteractiveSession(graph = graph)

    with tf.gfile.GFile(model_filepath, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    print('Check out the input placeholders:')
    nodes = [n.name + ' => ' + n.op for n in graph_def.node if n.op == 'Placeholder']
    for node in nodes:
        print(node)

    # Define a new static-shape input placeholder and wire it in as ImageTensor
    input_tensor = tf.placeholder(np.uint8, shape=[1, 400, 225, 3], name='ImageTensor')

    tf.import_graph_def(graph_def, {'ImageTensor': input_tensor}, name='')

    print('Model loading complete!')

    # Bypass the pad_to_bounding_box scope: remap the subgraph view to its
    # first input and last output, then ge.bypass reconnects that input
    # directly to the downstream consumers, dropping the dynamic-padding nodes
    name = "pad_to_bounding_box"
    print(name)
    sgv = ge.make_view_from_scope(name, tf.get_default_graph())
    print("\t" + sgv.inputs[0].name)
    for node in sgv.inputs:
        print("name in = " + node.name)
    for node in sgv.outputs:
        print("name out = " + node.name)
    print("\t" + sgv.outputs[len(sgv.outputs)-1].name)
    sgv = sgv.remap_inputs([0])
    sgv = sgv.remap_outputs([len(sgv.outputs)-1])
    (sgv2, det_inputs) = ge.bypass(sgv)


    frozen_graph = freeze_session(sess,
                              output_names=['SemanticPredictions'])
    tf.train.write_graph(frozen_graph, save_folder, save_name, as_text=False)


load_convert_save_graph("path/to/deeplab_mobilenet_v2_opt_flatten.pb", "/path/to", "deeplab_mobilenet_v2_opt_flatten_static.pb")
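
To sanity check step 3, you can reload the new .pb and confirm the input is now static; a quick sketch, under the same path assumptions as above:

import tensorflow as tf

# Confirm ImageTensor now has the static shape baked in by step 3.
graph_def = tf.GraphDef()
with tf.gfile.GFile("/path/to/deeplab_mobilenet_v2_opt_flatten_static.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")
    print(graph.get_tensor_by_name("ImageTensor:0").shape)  # expect (1, 400, 225, 3)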

Step 4: Convert to TFLITE

tflite_convert \
  --graph_def_file="/path/to/deeplab_mobilenet_v2_opt_flatten_static.pb" \
  --output_file="/path/to/deeplab_mobilenet_v2_opt_flatten_static.tflite" \
  --output_format=TFLITE \
  --input_shape=1,400,225,3 \
  --input_arrays="ImageTensor" \
  --inference_type=FLOAT \
  --inference_input_type=QUANTIZED_UINT8 \
  --std_dev_values=128 \
  --mean_values=128 \
  --change_concat_input_ranges=true \
  --output_arrays="SemanticPredictions" \
  --allow_custom_ops

Done

You can now run your tflite model.
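
For a quick sanity check on desktop before moving to Android, something like the following sketch with the Python tf.lite.Interpreter (TF 1.15; the path is illustrative) should run:

import numpy as np
import tensorflow as tf

# Load the converted model and run one dummy inference.
interpreter = tf.lite.Interpreter(
    model_path="/path/to/deeplab_mobilenet_v2_opt_flatten_static.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy uint8 image of the static shape baked in above (1, 400, 225, 3).
dummy = np.zeros(input_details[0]['shape'], dtype=np.uint8)
interpreter.set_tensor(input_details[0]['index'], dummy)
interpreter.invoke()

# The output is the SemanticPredictions label map.
print(interpreter.get_tensor(output_details[0]['index']).shape)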

Answered 2019-11-29T16:35:05.417

I had the same problem. From https://github.com/tantara/JejuNet I can see that he successfully converted the model to tflite. I PM'd him for help, but unfortunately there has been no response so far.

Answered 2018-11-23T01:25:08.960