I am using TensorRT together with CuPy. The code below fails if I create the stream with cp.cuda.Stream(non_blocking=True), but works perfectly with non_blocking=False. Why does it not work with non_blocking=True? I checked the input data and it is fine, yet the model ends up returning random detections (random data), which suggests there is a synchronization issue.
# Select the stream (created earlier with cp.cuda.Stream(non_blocking=True))
stream.use()
# Copy cupy array to the buffer
input_images = cp.array(batch_input_image)
cp.copyto(cuda_inputs[0], input_images)
# Run inference.
context.execute_async(bindings=bindings, stream_handle=stream.ptr, batch_size=len(batch_input_image))
# Copy results from the buffer
output_images = cuda_outputs[0].copy()
# Split results into batch
list_output = cp.split(output_images, indices_or_sections=len(batch_input_image), axis=0)
# Squeeze output arrays to remove axis of length one
list_output = [cp.squeeze(array) for array in list_output]
# Synchronize the stream
stream.synchronize()
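
For reference, here is a minimal sketch of the semantic difference I believe is relevant (this is not from my inference code, and it uses placeholder work instead of the TensorRT call): a stream created with non_blocking=False is implicitly ordered against CUDA's legacy default (null) stream, while a non_blocking=True stream is not, so any work that still ends up on the default stream can race with work enqueued on it.

import cupy as cp

# Assumption: the only difference under test is the implicit ordering
# against the legacy default (null) stream.
blocking = cp.cuda.Stream(non_blocking=False)      # implicitly ordered with the null stream
non_blocking = cp.cuda.Stream(non_blocking=True)   # no implicit ordering with the null stream

x = cp.random.random((1024, 1024))   # enqueued on the default (null) stream

with non_blocking:
    # This work is NOT implicitly ordered after the null-stream work above,
    # so it may read x before the previous kernel has finished.
    y = x * 2.0
    # Explicit synchronization is needed before anything outside this stream
    # (another stream, the host, or a library call) consumes y.
    non_blocking.synchronize()

If something like this is happening in my pipeline, then an explicit stream.synchronize() (or a CUDA event) before the outputs are consumed by code that does not run on the same stream would be the kind of mitigation to try; whether it applies presumably depends on where batch_input_image and the TensorRT buffers are created.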