deep-learning - How to convert a U-Net segmentation model to TensorRT on an NVIDIA Jetson Nano? (process killed error)

Question

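I trained a U-Net segmentation model with Keras (TF backend). I am trying to convert its frozen graph (.pb) to TensorRT format on a Jetson Nano, but the process gets killed (see below). I have read in other posts that this may be related to an out-of-memory problem. For the record, I already have an SSD MobileNet V2 model running on the Jetson Nano.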

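If I stop the systemctl service, I can run inference with the U-Net model without converting it to TensorRT (using only the frozen graph loaded with TensorFlow). This does not work once I start the service again (that is, while the other neural network is running), so I tried to convert my U-Net segmentation model to TensorRT to get an optimized version of it. That failed because the process was killed, and it may not be the right approach anyway.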

Is it possible to run two neural networks on a Jetson Nano? Is there another way to do this?

For reference, this is how I try to convert the frozen graph to TensorRT:

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph_gd, # Pass the parsed graph def here
    outputs=['conv2d_24/Sigmoid'],
    max_batch_size=1,
    max_workspace_size_bytes=1 << 32, # I have tried 25 and 32 here
    precision_mode='FP16'
)
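For scale, the two workspace values mentioned in the comment above differ by a factor of 128. A minimal sketch of the arithmetic, assuming the 4 GB Jetson Nano variant (the post does not say which board is used; `NANO_RAM_BYTES` and `to_mib` are illustrative names, not part of any library):

```python
# Size of each max_workspace_size_bytes value tried in the question.
MIB = 1 << 20

# Assumption: the 4 GB Jetson Nano, whose RAM is shared by CPU and GPU.
NANO_RAM_BYTES = 4 * (1 << 30)


def to_mib(num_bytes):
    """Convert a byte count to whole mebibytes."""
    return num_bytes // MIB


for shift in (25, 32):
    size = 1 << shift
    share = size / NANO_RAM_BYTES
    print(f"1 << {shift} = {to_mib(size)} MiB ({share:.1%} of assumed RAM)")
```

Under that assumption, `1 << 32` asks for a 4 GiB workspace, which is the board's entire memory, while `1 << 25` is only 32 MiB.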

This is where the process gets killed (while converting the U-Net frozen graph to TensorRT):

2020-10-05 16:00:58.200269: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2

WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.

2020-10-05 16:01:11.976893: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libnvinfer.so.7

2020-10-05 16:01:11.994472: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libnvinfer_plugin.so.7

WARNING:tensorflow:

The TensorFlow contrib module will not be included in TensorFlow 2.0.

For more information, please see:

* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md

* https://github.com/tensorflow/addons

* https://github.com/tensorflow/io (for I/O related ops)

If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From convert_pb_to_tensorrt.py:14: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.

2020-10-05 16:01:13.678101: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libnvinfer.so.7

2020-10-05 16:01:15.506432: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1

2020-10-05 16:01:15.512224: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:15.512359: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0

2020-10-05 16:01:15.512638: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session

2020-10-05 16:01:15.532712: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency

2020-10-05 16:01:15.533264: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x328fd900 initialized for platform Host (this does not guarantee that XLA will be used). Devices:

2020-10-05 16:01:15.533318: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version

2020-10-05 16:01:15.632451: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:15.632757: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x30d0edb0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:

2020-10-05 16:01:15.632808: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA Tegra X1, Compute Capability 5.3

2020-10-05 16:01:15.633163: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:15.633276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties: 

name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216

pciBusID: 0000:00:00.0

2020-10-05 16:01:15.633348: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2

2020-10-05 16:01:15.633500: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10

2020-10-05 16:01:15.716786: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10

2020-10-05 16:01:15.903326: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10

2020-10-05 16:01:16.060655: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10

2020-10-05 16:01:16.141950: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10

2020-10-05 16:01:16.142219: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8

2020-10-05 16:01:16.142553: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:16.142878: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:16.142991: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0

2020-10-05 16:01:16.143133: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2

2020-10-05 16:01:27.700226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1175] Device interconnect StreamExecutor with strength 1 edge matrix:

2020-10-05 16:01:27.700377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] 0 

2020-10-05 16:01:27.700417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] 0: N 

2020-10-05 16:01:27.713559: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:27.713897: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:27.714101: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1320] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 200 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)

Killed
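A bare `Killed` with no Python traceback is usually what the shell prints when the Linux out-of-memory killer ends a process, which would be consistent with the 200 MB the log reports for the TensorFlow GPU device. A minimal sketch for checking available memory and swap before starting the conversion, assuming a Linux system that exposes `/proc/meminfo`; `parse_meminfo`, `report`, and the sample text are illustrative and not from the original post:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> KiB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def report(fields):
    """Format available memory and free swap in MiB."""
    avail = fields.get("MemAvailable", 0) // 1024
    swap = fields.get("SwapFree", 0) // 1024
    return f"available: {avail} MiB, free swap: {swap} MiB"


# Made-up sample values, used only to show the output format.
SAMPLE = "MemTotal: 4059712 kB\nMemAvailable: 512000 kB\nSwapFree: 0 kB\n"
print(report(parse_meminfo(SAMPLE)))

# On the device itself, read the real file instead:
# with open("/proc/meminfo") as f:
#     print(report(parse_meminfo(f.read())))
```

If available memory is low and no swap is free while the other network is running, the conversion has little room to work with.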

score 0 · Accepted Answer

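If the model contains unsupported layers, it cannot be converted to TensorRT directly. In that case, TensorFlow's own version of TRT (TF-TRT) can still produce a result, because it copes well with unsupported layers: they are run by TensorFlow alongside the layers that were converted to TensorRT.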
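The fallback described above can be pictured with a toy model: runs of supported ops are grouped into segments that would become TensorRT engines, and everything else stays in TensorFlow. The sketch below is plain Python, not the TF-TRT API. The op list, the `SUPPORTED` set, and `split_segments` are hypothetical, `min_segment_size` is only loosely modeled on the converter's minimum segment size option, and a real converter partitions a graph rather than a flat list.

```python
from itertools import groupby

# Hypothetical set of op types the converter can handle.
SUPPORTED = {"Conv2D", "Relu", "MaxPool", "Sigmoid"}


def split_segments(ops, min_segment_size=2):
    """Group consecutive ops by backend.

    Runs of supported ops with at least min_segment_size members are
    assigned to "TRT"; all other ops fall back to "TF".
    """
    plan = []
    for supported, run in groupby(ops, key=lambda op: op in SUPPORTED):
        run = list(run)
        backend = "TRT" if supported and len(run) >= min_segment_size else "TF"
        # Merge with the previous entry when the backend is unchanged.
        if plan and plan[-1][0] == backend:
            plan[-1][1].extend(run)
        else:
            plan.append((backend, run))
    return plan


ops = ["Conv2D", "Relu", "MaxPool", "CustomResize", "Conv2D", "Sigmoid"]
for backend, run in split_segments(ops):
    print(backend, run)
```

In this toy example the made-up `CustomResize` op is the only one left to TensorFlow, and the supported ops on either side of it form two separate TensorRT segments.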

I hope this comes close to answering your question. TensorRT is a messy ecosystem.
