I trained and quantized a TensorFlow model on an Ubuntu 18.04 machine and converted it to the tflite format. I then deployed it on a Linux Yocto board equipped with an NPU accelerator, tflite_runtime, and NNAPI.

I noticed that the same tflite model produces different predictions when run on my PC's CPU versus on the board's NPU via NNAPI. The predictions are often similar, but in some cases they differ completely. When I disabled NNAPI on the board and ran inference on its CPU instead, the results matched those from the PC's CPU, so I suspect NNAPI is the cause. However, I don't understand why this happens. Is there a way to prevent it, or to make the network more robust to this during training?
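For reference, this is roughly how I quantify the divergence between the two runs. It is a minimal, self-contained sketch: the two lists are placeholder logits standing in for the real CPU and NPU+NNAPI outputs, not values from my actual model.

```python
# Sketch: measure how far the NPU/NNAPI output drifts from the CPU
# reference for one classification example. The vectors below are
# placeholders, not real model outputs.

def compare_outputs(cpu_out, npu_out):
    """Return (max absolute difference, whether top-1 class agrees)."""
    max_abs_diff = max(abs(a - b) for a, b in zip(cpu_out, npu_out))
    argmax = lambda v: max(range(len(v)), key=v.__getitem__)
    same_top1 = argmax(cpu_out) == argmax(npu_out)
    return max_abs_diff, same_top1

cpu_out = [0.10, 0.70, 0.20]   # placeholder: CPU prediction
npu_out = [0.12, 0.55, 0.33]   # placeholder: NPU+NNAPI prediction
diff, same = compare_outputs(cpu_out, npu_out)
print(diff, same)  # top-1 still agrees here despite a ~0.15 drift
```

In the "completely different" cases I mentioned, the top-1 class itself changes, not just the score values.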