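My goal is to convert a PyTorch model into a quantized tflite model that can be used for inference on the Edge TPU.
I was able to convert a fairly complex depth estimation model from PyTorch to tflite and run it successfully on the Edge TPU. But because not all operations are supported, inference is very slow (>800 ms). The Edge TPU compiler reports: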
Number of operations that will run on Edge TPU: 87
Number of operations that will run on CPU: 47
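To put those two compiler lines into one number, here is a minimal sketch that parses them and computes the share of operations mapped to the Edge TPU. The helper name `mapped_fraction` is made up for illustration, and the regular expressions assume the log wording shown above.

```python
import re


def mapped_fraction(log_text):
    """Return (tpu_ops, cpu_ops, fraction_on_tpu) parsed from the compiler log text."""
    tpu = int(re.search(r"run on Edge TPU:\s*(\d+)", log_text).group(1))
    cpu = int(re.search(r"run on CPU:\s*(\d+)", log_text).group(1))
    return tpu, cpu, tpu / (tpu + cpu)


log = """Number of operations that will run on Edge TPU: 87
Number of operations that will run on CPU: 47"""

tpu, cpu, frac = mapped_fraction(log)
print(f"{tpu} of {tpu + cpu} ops on the Edge TPU ({frac:.1%})")
```

Roughly a third of the operations fall back to the CPU, which fits the slow inference time described above.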
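Because I want a model that runs entirely on the TPU, I tried converting the simplest model I could think of, a MobileNetV2 classification model. But when I run the quantized model, I get strangely inaccurate results: all five top TFLite classes have exactly the same score.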
PyTorch | TFLite |
---|---|
Samoyed: 0.8303 | missile: 0.184565 |
Pomeranian: 0.06989 | kuvasz: 0.184565 |
keeshond: 0.01296 | stupa: 0.184565 |
collie: 0.0108 | Samoyed: 0.184565 |
Great Pyrenees: 0.00989 | Arctic fox: 0.184565 |
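Is this caused by quantizing the model from float32 to uint8, or am I doing something wrong? If it is caused by quantization, how can I mitigate it? The Coral classification example works fine and, as far as I know, it uses the same model.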
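For reference, TFLite's affine quantization stores a real value as an integer via `real = scale * (q - zero_point)`. The sketch below compares softmax over raw uint8 outputs with softmax over dequantized outputs. The raw values, `scale`, and `zero_point` are hypothetical numbers chosen for illustration, not values read from my model, and whether this effect explains the table above is only a guess.

```python
import math


def dequantize(q_values, scale, zero_point):
    """Map stored integers back to real values: real = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical uint8 outputs and quantization parameters (illustration only).
raw = [255, 255, 250, 128]
scale, zero_point = 0.1, 128

raw_scores = softmax([float(q) for q in raw])               # integers used as logits
real_scores = softmax(dequantize(raw, scale, zero_point))   # dequantized logits

print("raw:        ", [round(s, 4) for s in raw_scores])
print("dequantized:", [round(s, 4) for s in real_scores])
```

Two things are visible here: classes that share the same stored integer always get identical scores, and treating the integers as logits stretches the differences by a factor of `1 / scale`, so the two softmax outputs disagree.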
Conversion process
PyTorch -> ONNX -> OpenVINO -> TensorFlow -> TensorFlowLite
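I wrote my own code to convert the model from PyTorch to ONNX and from TensorFlow (.pb) to TFLite. For the remaining conversion steps I used the OpenVINO mo.py script and the openvino2tensorflow tool, because of the NCHW/NHWC layout mismatch between PyTorch and TensorFlow.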
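To make the layout mismatch concrete: PyTorch stores image batches channel-first (NCHW), while TensorFlow defaults to channel-last (NHWC). The sketch below reorders a tiny nested-list tensor by hand. The function name `nchw_to_nhwc` is made up for illustration, and this is not what openvino2tensorflow does internally, which I have not inspected.

```python
def nchw_to_nhwc(tensor):
    """Reorder a nested-list tensor from [N][C][H][W] to [N][H][W][C]."""
    return [
        [
            [[image[c][h][w] for c in range(len(image))]
             for w in range(len(image[0][0]))]
            for h in range(len(image[0]))
        ]
        for image in tensor
    ]


# One 2x2 image with 3 channels, stored channel-first as PyTorch would store it.
nchw = [[
    [[1, 2], [3, 4]],          # channel 0
    [[10, 20], [30, 40]],      # channel 1
    [[100, 200], [300, 400]],  # channel 2
]]

nhwc = nchw_to_nhwc(nchw)
print(nhwc[0][0][0])  # channel values of the top-left pixel
```

With NumPy arrays the same reordering is `np.transpose(x, (0, 2, 3, 1))`. Feeding a tensor in the wrong layout does not raise an error when the shapes happen to match, which makes this kind of mismatch easy to miss.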
Downloads
Depth estimation model: https://github.com/AaronZettler/miscellaneous/blob/master/mobilenet_v2_depth_est.pth?raw=true
Classification model: https://github.com/AaronZettler/miscellaneous/blob/master/mobilenetv2.tflite?raw=true
Labels: https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt
Image: https://raw.githubusercontent.com/pytorch/hub/master/images/dog.jpg
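Code
This code does not need an Edge TPU to run, but it does require the Google Coral library (pycoral). If I use different values for mean and std, e.g. (2.0, 76.0), I get a solid result for the dog.jpg image, but if I try to classify anything else I run into the same problem.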
import numpy as np
from PIL import Image
from pycoral.adapters import classify
from pycoral.adapters import common
from pycoral.utils.dataset import read_label_file
from tensorflow.lite.python.interpreter import Interpreter


def cropPIL(image, new_width, new_height):
    # center-crop the image to new_width x new_height
    width, height = image.size
    left = (width - new_width) / 2
    top = (height - new_height) / 2
    right = (width + new_width) / 2
    bottom = (height + new_height) / 2
    return image.crop((left, top, right, bottom))


def softmax(x):
    # subtract the maximum for numerical stability
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()


def classify_img(image_dir, lables_dir, model_dir, mean, std):
    # load labels and model
    labels = read_label_file(lables_dir)
    interpreter = Interpreter(model_path=model_dir)
    interpreter.allocate_tensors()

    # load the image, scale it to a height of 256 (keeping the aspect ratio)
    # and center-crop it to 224x224
    size = (256, 256)
    image = Image.open(image_dir).convert('RGB')
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    image = image.resize((int(size[0] * image.width / image.height), size[1]), Image.LANCZOS)
    image = cropPIL(image, 224, 224)
    image = np.asarray(image)

    # normalize the input image and map it into the quantized input range
    params = common.input_details(interpreter, 'quantization_parameters')
    scale = params['scales']
    zero_point = params['zero_points']
    normalized_input = (image - mean) / (std * scale) + zero_point
    np.clip(normalized_input, 0, 255, out=normalized_input)

    # set the image as input
    common.set_input(interpreter, normalized_input.astype(np.uint8))

    # run inference
    interpreter.invoke()

    # get the raw output tensor and run softmax on it
    output_details = interpreter.get_output_details()[0]
    output_data = interpreter.tensor(output_details['index'])().flatten()
    scores = softmax(output_data.astype(float))

    # get the top 5 classes
    classes = classify.get_classes_from_scores(scores, 5, 0.0)

    print('-------RESULTS--------')
    for c in classes:
        print('%s: %f' % (labels.get(c.id, c.id), c.score))


image_dir = 'data/dog.jpg'
lables_dir = 'data/imagenet_classes.txt'
model_dir = 'models/mobilenetv2.tflite'
classify_img(image_dir, lables_dir, model_dir, 114.0, 57.0)
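The normalization line in `classify_img` folds two steps into one expression: normalizing the pixel with `(pixel - mean) / std`, then quantizing the result with `real / scale + zero_point`. The sketch below checks that the fused form matches the two-step form. The `scale` and `zero_point` values are hypothetical, not read from my model, and `to_uint8` rounds to the nearest integer, whereas `astype(np.uint8)` in the script truncates.

```python
def normalize_then_quantize(pixel, mean, std, scale, zero_point):
    """Two explicit steps: normalize the pixel, then quantize the real value."""
    real = (pixel - mean) / std
    return real / scale + zero_point


def fused(pixel, mean, std, scale, zero_point):
    """Single expression, in the same form as the script above."""
    return (pixel - mean) / (std * scale) + zero_point


def to_uint8(value):
    """Round to the nearest integer and clamp into the uint8 range."""
    return int(min(255, max(0, round(value))))


# mean/std as passed to classify_img; scale/zero_point are hypothetical.
mean, std = 114.0, 57.0
scale, zero_point = 0.0175, 114

for pixel in (0, 114, 255):
    two_step = normalize_then_quantize(pixel, mean, std, scale, zero_point)
    one_step = fused(pixel, mean, std, scale, zero_point)
    print(pixel, round(two_step, 3), round(one_step, 3), to_uint8(one_step))
```

Both forms agree up to floating-point error, so the expression itself is consistent. Whether the single `mean` and `std` match what the converted model expects is a separate question.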
To run the PyTorch model on Google Colab, I had to replace
model = torch.hub.load('pytorch/vision:v0.9.0', 'mobilenet_v2', pretrained=True)
with
model = torchvision.models.mobilenet_v2(pretrained=True)
to get it to work.
This is the code I used to test the PyTorch model on my machine.
import torch
import torchvision
from PIL import Image
from torchvision import transforms


def inference(model, input_image, lables_dir):
    # standard ImageNet preprocessing: resize, center-crop, scale to [0, 1], normalize per channel
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    input_tensor = preprocess(input_image)
    input_batch = input_tensor.unsqueeze(0)

    # move the input and model to GPU for speed if available
    if torch.cuda.is_available():
        input_batch = input_batch.to('cuda')
        model.to('cuda')

    with torch.no_grad():
        output = model(input_batch)

    probabilities = torch.nn.functional.softmax(output[0], dim=0)

    # read the categories
    with open(lables_dir, "r") as f:
        categories = [s.strip() for s in f.readlines()]

    # collect the top 5 categories and their probabilities
    top5_prob, top5_catid = torch.topk(probabilities, 5)
    result = {}
    for i in range(top5_prob.size(0)):
        result[categories[top5_catid[i]]] = top5_prob[i].item()
    return result


def classify(image_dir, lables_dir):
    model = torchvision.models.mobilenet_v2(pretrained=True)
    model.eval()
    im = Image.open(image_dir)
    results = inference(model, im, lables_dir)
    for name, prob in results.items():
        print(f'{name}: {round(prob, 5)}')


classify('data/dog.jpg', 'data/imagenet_classes.txt')
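To compare the two preprocessing pipelines: `transforms.Normalize` works on pixels scaled to [0, 1] with per-channel statistics, while the TFLite script works on 0 to 255 pixels with one mean and one std for all channels. The sketch below converts the torchvision constants to the 0 to 255 scale. The helper name `to_pixel_scale` is made up for illustration.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)  # per-channel mean on the [0, 1] scale
IMAGENET_STD = (0.229, 0.224, 0.225)   # per-channel std on the [0, 1] scale


def to_pixel_scale(values):
    """Convert statistics defined on [0, 1] pixels to the 0-255 pixel scale."""
    return tuple(round(v * 255.0, 3) for v in values)


mean_255 = to_pixel_scale(IMAGENET_MEAN)
std_255 = to_pixel_scale(IMAGENET_STD)

print("mean (0-255):", mean_255, "average:", round(sum(mean_255) / 3, 2))
print("std  (0-255):", std_255, "average:", round(sum(std_255) / 3, 2))
```

The channel averages come out at about 114.5 and 57.6, so the 114.0 and 57.0 passed to `classify_img` are close to the averaged torchvision values, although the per-channel differences are dropped.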