tesseract - Tesseract training fails for a new font

Translated from: https://stackoverflow.com/questions/58889736 2019-11-16T10:27:13.683

Viewed 240 times

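I am using the Tesseract library for an OCR project that needs to recognize a new font. I followed this tutorial on YouTube and added a new font, "B Nazanin", which is popular for Persian, but I ran into the following errors: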

=== Starting training for language 'fas'
[‫شنبه ۱۶ نوامبر ۱۹، ساعت ۱۳:۳۴:۲۶ (+0330)‬] /usr/bin/text2image --fonts_dir=fonts --font=B Nazanin --outputbase=/tmp/font_tmp.0dLuk6X9KA/sample_text.txt --text=/tmp/font_tmp.0dLuk6X9KA/sample_text.txt --fontconfig_tmpdir=/tmp/font_tmp.0dLuk6X9KA
Stripped 1 unrenderable words
Error in boxaGetExtent: boxa not defined
Error in boxaAddBox: box not defined
Rendered page 0 to file /tmp/font_tmp.0dLuk6X9KA/sample_text.txt.tif
Rtl = 0 ,vertical=0

=== Phase I: Generating training images ===
Rendering using B Nazanin
[‫شنبه ۱۶ نوامبر ۱۹، ساعت ۱۳:۳۴:۲۷ (+0330)‬] /usr/bin/text2image --fontconfig_tmpdir=/tmp/font_tmp.0dLuk6X9KA --fonts_dir=fonts --strip_unrenderable_words --leading=32 --xsize=3600 --char_spacing=0.0 --exposure=0 --outputbase=/tmp/fas-2019-11-16.Upk/fas.B_Nazanin.exp0 --max_pages=10 --font=B Nazanin --text=../langdata_lstm/fas/fas.training_text
ERROR: Non-existent flag --max_pages=10
ERROR: Program text2image failed. Abort.

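I have searched for this error without success. As far as I can tell, the max_pages argument is set correctly, and I see no reason why the box files should not be created. What is causing the error, and how can I fix it? Thanks for your help.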
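
One possible reading of the log, offered as an assumption and not a confirmed diagnosis: the line "ERROR: Non-existent flag --max_pages=10" appears right after the text2image command and just before the abort, which would be consistent with the installed text2image binary not recognizing a flag that the training script passes to it (for example, if the script and the binary come from different Tesseract versions). The minimal Python sketch below only extracts such rejected flag names from a saved log so they can be compared against what the installed binary accepts. The names `rejected_flags` and `sample_log` are hypothetical, and the pattern assumes the exact message wording seen in the log above.

```python
import re

# Assumption: the message format matches the line seen in the log above,
#   ERROR: Non-existent flag --max_pages=10
_UNKNOWN_FLAG = re.compile(r"ERROR: Non-existent flag --([A-Za-z0-9_]+)")


def rejected_flags(log_text):
    """Return the flag names reported as non-existent, in order, without duplicates."""
    seen = []
    for match in _UNKNOWN_FLAG.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


# Two lines copied from the log in the question.
sample_log = """\
ERROR: Non-existent flag --max_pages=10
ERROR: Program text2image failed. Abort.
"""

print(rejected_flags(sample_log))  # prints ['max_pages']
```

Each name returned this way could then be checked by hand against the flags the installed text2image actually supports, and the versions of the training script and the binary compared.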
