
I have some code that reads a serialized TensorRT engine:

import tensorrt as trt
import pycuda.driver as cuda

# Initialize CUDA and create a context on the first device
cuda.init()
device = cuda.Device(0)
context = device.make_context()

# Deserialize a TensorRT engine from disk
logger = trt.Logger(trt.Logger.INFO)
with trt.Runtime(logger) as runtime:
    with open('model.trt', 'rb') as in_:
        engine = runtime.deserialize_cuda_engine(in_.read())
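
As a minimal sketch of how the failure shows up: deserialize_cuda_engine returns None instead of raising when it fails, so the problem described below can be detected with an explicit check right after the call.

# Sketch: deserialize_cuda_engine returns None on failure rather than raising,
# so guard the result explicitly (the error details go to the TensorRT logger).
if engine is None:
    raise RuntimeError('Failed to deserialize model.trt; see logger output')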

It runs fine on my Nvidia Jetson Nano, until I package it with PyInstaller:

pyinstaller temp.py

In the packaged code, runtime.deserialize_cuda_engine returns None and the logger says:

Cuda Error in loadKernel: 3 (initialization error)
[TensorRT] ERROR: INVALID_STATE: std::exception
[TensorRT] ERROR: INVALID_CONFIG: Deserialize the cuda engine failed.

However, when I build the engine from scratch instead, e.g.

from contextlib import ExitStack

# Same CUDA setup as before
cuda.init()
device = cuda.Device(0)
context = device.make_context()

logger = trt.Logger(trt.Logger.INFO)
with ExitStack() as stack:
    builder = stack.enter_context(trt.Builder(logger))
    network = stack.enter_context(builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    ))

    # Minimal network: a softmax over a small fp16 input
    i = network.add_input('input0', trt.float16, (3, 2))
    s = network.add_softmax(i)
    network.mark_output(s.get_output(0))

    config = stack.enter_context(builder.create_builder_config())
    # ...some builder settings like opt profiles and fp16 mode...
    engine = builder.build_engine(network, config)

then everything works fine, even after packaging.
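
For completeness, the engine built in-process could also be written back to disk in the same serialized form that the first snippet reads. A minimal sketch, assuming the engine built above and reusing the model.trt path from the first snippet:

# Sketch: serialize the freshly built engine to disk so it can later be loaded
# with runtime.deserialize_cuda_engine() as in the first snippet.
with open('model.trt', 'wb') as out_:
    out_.write(engine.serialize())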

The engine was prepared with trtexec on the same machine. The CUDA version is V10.2.89 and the pycuda version is 2019.1.2. I believe this is the standard Jetson installation as of August 2020.

Any ideas what might be going on here, and what a possible workaround could be?
