
My goal is to convert a PyTorch model into a quantized tflite model that can be used for inference on an Edge TPU.

I was able to convert a fairly complex depth estimation model from PyTorch to tflite and run it successfully on the Edge TPU. But because not all operations are supported, inference is very slow (>800 ms).

Number of operations that will run on Edge TPU: 87
Number of operations that will run on CPU: 47

[image: depth estimation result]

Because I want a model that runs entirely on the TPU, I tried converting the simplest model I could think of, a MobileNetV2 classification model. But when I run the quantized model I get strangely inaccurate results.

PyTorch                      TFLite
Samoyed: 0.8303              missile: 0.184565
Pomeranian: 0.06989          kuvasz: 0.184565
keeshond: 0.01296            stupa: 0.184565
collie: 0.0108               Samoyed: 0.184565
Great Pyrenees: 0.00989      Arctic fox: 0.184565

Is this caused by quantizing the model from float32 to uint8, or am I doing something wrong? And if it is caused by the quantization, how can I mitigate it? The Coral classification example works fine, and as far as I can tell it uses the same model.

Conversion process

PyTorch -> ONNX -> OpenVINO -> TensorFlow -> TensorFlowLite

I wrote my own code to convert the model from PyTorch to ONNX and from TensorFlow (pb) to TFLite. For the other conversion steps I used the OpenVINO mo.py script and the openvino2tensorflow tool, because of the NCHW/NHWC layout mismatch between PyTorch and TensorFlow.
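
For reference, the PyTorch -> ONNX step can be as small as the sketch below. This is a minimal outline of the idea, not my exact code; the input resolution, opset version, and tensor names are assumptions.

import torch
import torchvision

# Export MobileNetV2 to ONNX. PyTorch traces the model with a dummy input,
# so the exported graph is fixed to this NCHW shape (224x224 assumed here).
model = torchvision.models.mobilenet_v2(pretrained=True)
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy_input,
    'mobilenetv2.onnx',
    opset_version=11,            # assumed; any opset the downstream tools accept
    input_names=['input'],
    output_names=['output'],
)

The TensorFlow -> TFLite step needs a representative dataset for full-integer quantization, because the converter uses it to calibrate activation ranges. Again only a sketch under assumptions (the saved_model path and the generator are placeholders; random calibration data is just for illustration, and a calibration set that does not match the real input distribution is itself a known cause of poor quantized accuracy):

import numpy as np
import tensorflow as tf

# Placeholder calibration data; real, correctly preprocessed images
# should be fed here instead of random noise.
def representative_dataset():
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]  # NHWC

converter = tf.lite.TFLiteConverter.from_saved_model('saved_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8   # EdgeTPU expects uint8 I/O
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()

with open('mobilenetv2.tflite', 'wb') as f:
    f.write(tflite_model)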

Downloads

Depth estimation model: https://github.com/AaronZettler/miscellaneous/blob/master/mobilenet_v2_depth_est.pth?raw=true

Classification model: https://github.com/AaronZettler/miscellaneous/blob/master/mobilenetv2.tflite?raw=true

Labels: https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt

Image: https://raw.githubusercontent.com/pytorch/hub/master/images/dog.jpg

Code

This code does not need an Edge TPU to run, but it does need the Google Coral libraries. If I use different values for mean and std, e.g. (2.0, 76.0), I get solid results for the dog.jpg image, but as soon as I try to classify anything else I run into the same problem.


import numpy as np
from PIL import Image
from pycoral.adapters import classify
from pycoral.adapters import common
from pycoral.utils.dataset import read_label_file

from tensorflow.lite.python.interpreter import Interpreter


def cropPIL(image, new_width, new_height):
    width, height = image.size

    left = (width - new_width)/2
    top = (height - new_height)/2
    right = (width + new_width)/2
    bottom = (height + new_height)/2

    return image.crop((left, top, right, bottom))

def softmax(x):
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

def classify_img(image_dir, labels_dir, model_dir, mean, std):
    # load labels and model
    labels = read_label_file(labels_dir)
    interpreter = Interpreter(model_path=model_dir)
    interpreter.allocate_tensors()
    
    # load and resize the image
    size = (256, 256)
    image = Image.open(image_dir).convert('RGB')
    image = image.resize((int(size[0] * image.width / image.height), size[1]), Image.ANTIALIAS)
    image = cropPIL(image, 224, 224)
    image = np.asarray(image)

    #normalizing the input image
    params = common.input_details(interpreter, 'quantization_parameters')
    scale = params['scales']
    zero_point = params['zero_points']

    normalized_input = (image - mean) / (std * scale) + zero_point
    np.clip(normalized_input, 0, 255, out=normalized_input)

    #setting the image as input
    common.set_input(interpreter, normalized_input.astype(np.uint8))
    
    #run inference
    interpreter.invoke()

    #get output tensor and run softmax
    output_details = interpreter.get_output_details()[0]
    output_data = interpreter.tensor(output_details['index'])().flatten()
    scores = softmax(output_data.astype(float))

    #get the top 10 classes
    classes = classify.get_classes_from_scores(scores, 5, 0.0)

    print('-------RESULTS--------')
    for c in classes:
        print('%s: %f' % (labels.get(c.id, c.id), c.score))


image_dir  = 'data/dog.jpg'
labels_dir = 'data/imagenet_classes.txt'
model_dir  = 'models/mobilenetv2.tflite'

classify_img(image_dir, labels_dir, model_dir, 114.0, 57.0)
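
For context on the mean/std arithmetic above: a quantized uint8 tensor q represents real values r through the affine mapping r = scale * (q - zero_point). The model was trained on inputs normalized as (x/255 - m) / s with the ImageNet statistics m ≈ 0.449 and s ≈ 0.226 (per-channel values averaged), which in 0-255 pixel units is roughly (x - 114.4) / 57.6; that is where the 114.0 and 57.0 arguments come from. Setting scale * (q - zero_point) equal to (x - mean) / std and solving for q yields exactly the normalized_input line in classify_img. A small sketch of both directions of the mapping (my illustration, not part of the original code):

import numpy as np

def quantize_input(x, mean, std, scale, zero_point):
    # Solve scale * (q - zero_point) == (x - mean) / std for q.
    q = (x - mean) / (std * scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # Inverse mapping: recover real values from quantized ones.
    return scale * (q.astype(np.float32) - zero_point)

One related point worth checking (my observation, not something established above): if the output tensor is quantized too, softmax should be applied to dequantize(output, scale, zero_point) rather than to the raw uint8 scores; otherwise the logits are implicitly rescaled and the resulting probabilities flatten out.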

To run the PyTorch model on Google Colab I had to replace

model = torch.hub.load('pytorch/vision:v0.9.0', 'mobilenet_v2', pretrained=True)

with

model = torchvision.models.mobilenet_v2(pretrained=True)

to get it to work.

This is the code I use to test the PyTorch model on my machine.

import torch
from PIL import Image
from torchvision import transforms
import torchvision

def inference(model, input_image, labels_dir):
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    input_tensor = preprocess(input_image)
    input_batch = input_tensor.unsqueeze(0)

    # move the input and model to GPU for speed if available
    if torch.cuda.is_available():
        input_batch = input_batch.to('cuda')
        model.to('cuda')

    with torch.no_grad():
        output = model(input_batch)

    probabilities = torch.nn.functional.softmax(output[0], dim=0)

    # Read the categories
    with open(labels_dir, "r") as f:
        categories = [s.strip() for s in f.readlines()]

    # Show top categories per image
    top5_prob, top5_catid = torch.topk(probabilities, 5)
    result = {}
    for i in range(top5_prob.size(0)):
        result[categories[top5_catid[i]]] = top5_prob[i].item()
    return result

def classify(image_dir, labels_dir):
    model = torchvision.models.mobilenet_v2(pretrained=True)
    model.eval()

    im = Image.open(image_dir)
    results = inference(model, im, labels_dir)
    for result in results:
        print(f'{result}: {round(results[result], 5)}')


classify('data/dog.jpg', 'data/imagenet_classes.txt')

1 Answer


As of openvino2tensorflow v1.20.4, PReLU (LeakyReLU) can now be mapped to the EdgeTPU. However, because the model is so large, not all operations can be mapped to the EdgeTPU: the part that does not fit into the EdgeTPU's RAM is offloaded to the CPU for inference, and that is very slow. In that case, inference on the CPU alone is 4 to 5 times faster. The EdgeTPU does not support PReLU (LeakyReLU), so the operation has to be replaced; openvino2tensorflow v1.20.4 performs this replacement automatically during conversion.
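
The replacement described here is easy to write out explicitly. As a minimal numpy sketch of the decomposition (my illustration of the idea, not the tool's actual implementation):

import numpy as np

def prelu_decomposed(x, alpha):
    # PReLU(x) = x for x > 0 and alpha * x otherwise, which equals
    # Maximum(x, 0) + Mul(alpha, Minimum(x, 0)), i.e. the
    # Maximum (ReLU), Minimum, Mul and Add ops listed below.
    return np.maximum(x, 0.0) + alpha * np.minimum(x, 0.0)

print(prelu_decomposed(np.array([-2.0, -0.5, 0.0, 1.5]), alpha=0.25))
# -> [-0.5   -0.125  0.     1.5  ]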

docker run --gpus all -it --rm \
-v `pwd`:/home/user/workdir \
pinto0309/openvino2tensorflow:latest

cd workdir

MODEL=depth_estimation_mbnv2

H=180
W=320
$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/mo.py \
--input_model ${MODEL}_${H}x${W}.onnx \
--data_type FP32 \
--output_dir ${H}x${W}/openvino/FP32
$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/mo.py \
--input_model ${MODEL}_${H}x${W}.onnx \
--data_type FP16 \
--output_dir ${H}x${W}/openvino/FP16
mkdir -p ${H}x${W}/openvino/myriad
${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/lib/intel64/myriad_compile \
-m ${H}x${W}/openvino/FP16/${MODEL}_${H}x${W}.xml \
-ip U8 \
-VPU_NUMBER_OF_SHAVES 4 \
-VPU_NUMBER_OF_CMX_SLICES 4 \
-o ${H}x${W}/openvino/myriad/${MODEL}_${H}x${W}.blob

openvino2tensorflow \
--model_path ${H}x${W}/openvino/FP32/${MODEL}_${H}x${W}.xml \
--output_saved_model \
--output_pb \
--output_no_quant_float32_tflite \
--output_weight_quant_tflite \
--output_float16_quant_tflite \
--output_integer_quant_tflite \
--string_formulas_for_normalization 'data / 255' \
--output_integer_quant_type 'uint8' \
--output_tfjs \
--output_coreml \
--output_tftrt
mv saved_model saved_model_${H}x${W}

openvino2tensorflow \
--model_path ${H}x${W}/openvino/FP32/${MODEL}_${H}x${W}.xml \
--output_saved_model \
--output_pb \
--output_edgetpu \
--string_formulas_for_normalization 'data / 255' \
--output_integer_quant_type 'uint8'
mv saved_model/model_full_integer_quant.tflite saved_model_${H}x${W}/model_full_integer_quant.tflite
mv saved_model/model_full_integer_quant_edgetpu.tflite saved_model_${H}x${W}/model_full_integer_quant_edgetpu.tflite

mv ${H}x${W}/openvino saved_model_${H}x${W}/openvino
mv ${MODEL}_${H}x${W}.onnx saved_model_${H}x${W}/${MODEL}_${H}x${W}.onnx


H=240
W=320
$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/mo.py \
--input_model ${MODEL}_${H}x${W}.onnx \
--data_type FP32 \
--output_dir ${H}x${W}/openvino/FP32
$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/mo.py \
--input_model ${MODEL}_${H}x${W}.onnx \
--data_type FP16 \
--output_dir ${H}x${W}/openvino/FP16
mkdir -p ${H}x${W}/openvino/myriad
${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/lib/intel64/myriad_compile \
-m ${H}x${W}/openvino/FP16/${MODEL}_${H}x${W}.xml \
-ip U8 \
-VPU_NUMBER_OF_SHAVES 4 \
-VPU_NUMBER_OF_CMX_SLICES 4 \
-o ${H}x${W}/openvino/myriad/${MODEL}_${H}x${W}.blob

openvino2tensorflow \
--model_path ${H}x${W}/openvino/FP32/${MODEL}_${H}x${W}.xml \
--output_saved_model \
--output_pb \
--output_no_quant_float32_tflite \
--output_weight_quant_tflite \
--output_float16_quant_tflite \
--output_integer_quant_tflite \
--string_formulas_for_normalization 'data / 255' \
--output_integer_quant_type 'uint8' \
--output_tfjs \
--output_coreml \
--output_tftrt
mv saved_model saved_model_${H}x${W}

openvino2tensorflow \
--model_path ${H}x${W}/openvino/FP32/${MODEL}_${H}x${W}.xml \
--output_saved_model \
--output_pb \
--output_edgetpu \
--string_formulas_for_normalization 'data / 255' \
--output_integer_quant_type 'uint8'
mv saved_model/model_full_integer_quant.tflite saved_model_${H}x${W}/model_full_integer_quant.tflite
mv saved_model/model_full_integer_quant_edgetpu.tflite saved_model_${H}x${W}/model_full_integer_quant_edgetpu.tflite

mv ${H}x${W}/openvino saved_model_${H}x${W}/openvino
mv ${MODEL}_${H}x${W}.onnx saved_model_${H}x${W}/${MODEL}_${H}x${W}.onnx
  • PReLU (LeakyReLU) is replaced, from PReLU (LeakyReLU) to Maximum (ReLU), Minimum, Mul, Add:

    [image: model graph before the replacement]
    [image: model graph after the replacement]

  • EdgeTPU model:

    [image: converted EdgeTPU model graph]
    [image: converted EdgeTPU model graph]
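
To verify the CPU-versus-EdgeTPU timing claim on the converted models, a simple benchmark along these lines can be used (a sketch with assumed file paths; the EdgeTPU variant additionally needs the libedgetpu delegate, and the input is assumed to be uint8 as produced above):

import time
import numpy as np
import tflite_runtime.interpreter as tflite

def benchmark(model_path, runs=50, delegate=None):
    # Average invoke() latency in milliseconds, after one warm-up run.
    delegates = [tflite.load_delegate(delegate)] if delegate else []
    interpreter = tflite.Interpreter(model_path=model_path,
                                     experimental_delegates=delegates)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    dummy = np.random.randint(0, 256, size=inp['shape'], dtype=np.uint8)
    interpreter.set_tensor(inp['index'], dummy)
    interpreter.invoke()  # warm-up
    start = time.perf_counter()
    for _ in range(runs):
        interpreter.invoke()
    return (time.perf_counter() - start) / runs * 1000.0

print('CPU-only: %.1f ms' % benchmark(
    'saved_model_180x320/model_full_integer_quant.tflite'))
print('EdgeTPU:  %.1f ms' % benchmark(
    'saved_model_180x320/model_full_integer_quant_edgetpu.tflite',
    delegate='libedgetpu.so.1'))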
Answered 2021-09-13T04:29:44