I have a .pb model that I want to use as a custom MLKit model. MLKit only supports .tflite models, so I used toco to convert it to TensorFlow Lite, but the resulting file is too large to upload to Firebase (95 MB, where only 40 MB is allowed).
Is there a way to quantize the graph before converting it to TFLite, or to quantize an existing .tflite model?
When I try the former, I get the following error message: "Unsupported TensorFlow op: Dequantize, for which the quantized form is not yet implemented. Sorry, and patches welcome (that's a relatively fun patch to write, mostly providing the actual quantized arithmetic code for this op)."
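For reference, what I'm essentially trying to achieve is post-training weight quantization during conversion, sketched here with a toy Keras model as a stand-in (my real model is a frozen .pb graph, so the loading step would differ; the quantization flag is the part I care about):

```python
import tensorflow as tf

# Toy stand-in model; the real model would be loaded from the frozen .pb graph.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Post-training quantization: weights are stored as 8-bit integers,
# which should shrink the file to roughly a quarter of its float size.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_bytes = converter.convert()

with open("model_quant.tflite", "wb") as f:
    f.write(tflite_bytes)
print(len(tflite_bytes))
```

If something like this is the right approach, my remaining question is how to apply it to a frozen .pb graph without hitting the Dequantize error above.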