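I have been following this guide to implement a Detectron2 model on SageMaker. Training and batch transform both work fine.
However, I tried to tweak the code slightly to create an endpoint that can be invoked by sending a payload, and I ran into trouble.
At the end of this notebook, after creating the SageMaker model object: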
model = PyTorchModel(
    name="d2-sku110k-model",
    model_data=training_job_artifact,
    role=role,
    sagemaker_session=sm_session,
    entry_point="predict_sku110k.py",
    source_dir="container_serving",
    image_uri=serve_image_uri,
    framework_version="1.6.0",
    code_location=f"s3://{bucket}/{prefix_code}",
)
I added the following code:
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge')
I can see that the model deploys successfully.
However, when I try to run a prediction on an image with:
predictor.predict(input)
I get the following error:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary with message "Type [application/x-npy] not support this type yet
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 126, in transform
    result = self._transform_fn(self._model, input_data, content_type, accept)
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 215, in _default_transform_fn
    data = self._input_fn(input_data, content_type)
  File "/opt/ml/model/code/predict_sku110k.py", line 98, in input_fn
    raise ValueError(err_msg)
ValueError: Type [application/x-npy] not support this type yet
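As far as I can tell from the traceback, the request reached input_fn labelled as application/x-npy, and the check in my script rejects everything except application/x-image. The sketch below reproduces only that check locally. `input_fn_stub` is a hypothetical, simplified stand-in for the input_fn in the script further down: it keeps the same content-type test but skips the image decoding, so it needs neither OpenCV nor a running endpoint.

```python
SUPPORTED_CONTENT_TYPE = "application/x-image"


def input_fn_stub(request_body: bytes, request_content_type: str) -> bytes:
    """Simplified stand-in for input_fn: same content-type check, no decoding."""
    if request_content_type != SUPPORTED_CONTENT_TYPE:
        raise ValueError(f"Type [{request_content_type}] not support this type yet")
    return request_body


if __name__ == "__main__":
    # Placeholder payload; the check only looks at the content type.
    payload = b"placeholder image bytes"
    for content_type in ("application/x-npy", "application/x-image"):
        try:
            input_fn_stub(payload, content_type)
            print(f"{content_type}: accepted")
        except ValueError as err:
            print(f"{content_type}: rejected ({err})")
```

If this reading is right, the payload itself may not be the problem: whatever I pass to predict() seems to be sent with a content type that input_fn does not accept.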
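I tried a bunch of different input types: the image encoded as bytes (created with cv2.imencode('.jpg', cv_img)[1].tobytes()), a numpy array, a BytesIO object (created with the io module), and a dictionary of the form {'input': image}, where image is any of the previous ones (I tried this because a TensorFlow endpoint I created some time ago used that format).
Since I think it might be relevant, here is the inference script used as the entry point: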
"""Code used for sagemaker batch transform jobs"""
from typing import BinaryIO, Mapping
import json
import logging
import sys
from pathlib import Path
import numpy as np
import cv2
import torch
from detectron2.engine import DefaultPredictor
from detectron2.config import CfgNode
##############
# Macros
##############
LOGGER = logging.Logger("InferenceScript", level=logging.INFO)
HANDLER = logging.StreamHandler(sys.stdout)
HANDLER.setFormatter(logging.Formatter("%(levelname)s | %(name)s | %(message)s"))
LOGGER.addHandler(HANDLER)
##########
# Deploy
##########
def _load_from_bytearray(request_body: BinaryIO) -> np.ndarray:
    npimg = np.frombuffer(request_body, np.uint8)
    return cv2.imdecode(npimg, cv2.IMREAD_COLOR)


def model_fn(model_dir: str) -> DefaultPredictor:
    r"""Load trained model

    Parameters
    ----------
    model_dir : str
        S3 location of the model directory

    Returns
    -------
    DefaultPredictor
        PyTorch model created by using Detectron2 API
    """
    path_cfg, path_model = None, None
    for p_file in Path(model_dir).iterdir():
        if p_file.suffix == ".json":
            path_cfg = p_file
        if p_file.suffix == ".pth":
            path_model = p_file

    LOGGER.info(f"Using configuration specified in {path_cfg}")
    LOGGER.info(f"Using model saved at {path_model}")

    if path_model is None:
        err_msg = "Missing model PTH file"
        LOGGER.error(err_msg)
        raise RuntimeError(err_msg)
    if path_cfg is None:
        err_msg = "Missing configuration JSON file"
        LOGGER.error(err_msg)
        raise RuntimeError(err_msg)

    with open(str(path_cfg)) as fid:
        cfg = CfgNode(json.load(fid))

    cfg.MODEL.WEIGHTS = str(path_model)
    cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

    return DefaultPredictor(cfg)


def input_fn(request_body: BinaryIO, request_content_type: str) -> np.ndarray:
    r"""Parse input data

    Parameters
    ----------
    request_body : BinaryIO
        encoded input image
    request_content_type : str
        type of content

    Returns
    -------
    np.ndarray
        input image

    Raises
    ------
    ValueError
        ValueError if the content type is not `application/x-image`
    """
    if request_content_type == "application/x-image":
        np_image = _load_from_bytearray(request_body)
    else:
        err_msg = f"Type [{request_content_type}] not support this type yet"
        LOGGER.error(err_msg)
        raise ValueError(err_msg)
    return np_image


def predict_fn(input_object: np.ndarray, predictor: DefaultPredictor) -> Mapping:
    r"""Run Detectron2 prediction

    Parameters
    ----------
    input_object : np.ndarray
        input image
    predictor : DefaultPredictor
        Detectron2 default predictor (see Detectron2 documentation for details)

    Returns
    -------
    Mapping
        a dictionary that contains: the image shape (`image_height`, `image_width`), the predicted
        bounding boxes in format x1y1x2y2 (`pred_boxes`), the confidence scores (`scores`) and the
        labels associated with the bounding boxes (`pred_classes`)
    """
    LOGGER.info(f"Prediction on image of shape {input_object.shape}")
    outputs = predictor(input_object)
    fmt_out = {
        "image_height": input_object.shape[0],
        "image_width": input_object.shape[1],
        "pred_boxes": outputs["instances"].pred_boxes.tensor.tolist(),
        "scores": outputs["instances"].scores.tolist(),
        "pred_classes": outputs["instances"].pred_classes.tolist(),
    }
    LOGGER.info(f"Number of detected boxes: {len(fmt_out['pred_boxes'])}")
    return fmt_out


# pylint: disable=unused-argument
def output_fn(predictions, response_content_type):
    r"""Serialize the prediction result into the desired response content type"""
    return json.dumps(predictions)
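Can anyone point out the correct format for invoking the model (or how to adapt the code to work with an endpoint)? I am thinking of changing request_content_type to "application/json", but I am not sure it would help much.
Edit: I tried a solution inspired by this SO thread, but it did not work in my case.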
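One direction that might avoid touching input_fn at all is to tell the predictor which content type to send. The following is an untested sketch, assuming SageMaker Python SDK v2, where `deploy()` accepts `serializer` and `deserializer` arguments and `IdentitySerializer` passes the payload through unchanged. `deploy_image_endpoint` is a hypothetical helper name, not something from the guide or the notebook.

```python
def deploy_image_endpoint(model, instance_type="ml.m5.xlarge"):
    """Deploy `model` so that predict() sends raw bytes as application/x-image.

    Hypothetical helper; assumes SageMaker Python SDK v2.
    """
    # Imported lazily so this sketch can be loaded without the SDK installed.
    from sagemaker.deserializers import JSONDeserializer
    from sagemaker.serializers import IdentitySerializer

    return model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        # Send the payload as-is, labelled with the type input_fn checks for.
        serializer=IdentitySerializer(content_type="application/x-image"),
        # output_fn returns json.dumps(...), so parse the response body as JSON.
        deserializer=JSONDeserializer(),
    )
```

With this, the call would presumably become predictor = deploy_image_endpoint(model) followed by predictor.predict(cv2.imencode('.jpg', cv_img)[1].tobytes()), since _load_from_bytearray expects encoded image bytes rather than a numpy array or a dictionary.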