tesseract - Error using text2image Font Exocet Light failed with 223518 hits = 99.94% when trying to build image file using Diablo 2 font

Question

I am running Tesseract on Windows 11 from the command prompt.

The text file is my training data: the words that I want to turn into images. The output feeds the next step of the Tesseract process for training on my font. I pass --find_fonts, although there is only one font in the folder.

text2image --text="C:\PythonProjects\DiabloTesseractTrainFont\text.txt" --outputbase="C:\PythonProjects\DiabloTesseractTrainFont\Output\Dia.font.exp0" --fontconfig_tmpdir="C:\PythonProjects\DiabloTesseractTrainFont" --find_fonts --fonts_dir="C:\PythonProjects\DiabloTesseractTrainFont\Diablo Fonts"

The result: Total chars = 223645 Font Exocet Light failed with 223518 hits = 99.94%

Not sure why it fails. I have built something similar to this before. I have tried with a font file that I know has worked and it does the exact same thing.

Any help would be appreciated.

score 0 · Accepted Answer

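I solved it. Some characters in the text file had been changed when I read it into Python. I believe they were originally bullet points, but I read the file in Python with ASCII encoding and errors ignored. I assumed those characters would simply be dropped. I was wrong: the bullets were replaced with something that shows up as PAD. I found them in Notepad++, highlighted one, and replaced them all with spaces. Note that when I did the replace, the Find field in Notepad++ looked empty, but it still replaced every occurrence. Now it builds fine. I was stuck on this for hours, so I hope this helps someone.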
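The same cleanup can be done in Python instead of Notepad++. The following is only a minimal sketch, assuming the offending characters are invisible control characters (Notepad++ shows some of these with abbreviations such as PAD). The helper names `is_suspect`, `find_suspect_chars` and `clean_text` are made up for illustration, and the sample string stands in for the real training text. It counts characters in the Unicode "Other" categories (control, format, unassigned and so on) and replaces them with spaces, leaving newlines and tabs alone.

```python
import unicodedata
from collections import Counter

# Whitespace control characters that a training text legitimately contains.
KEEP = {"\n", "\r", "\t"}


def is_suspect(ch):
    """True for control/format/unassigned characters other than newline and tab."""
    return ch not in KEEP and unicodedata.category(ch).startswith("C")


def find_suspect_chars(text):
    """Count each suspect character so you can see what is hiding in the text."""
    return Counter(ch for ch in text if is_suspect(ch))


def clean_text(text):
    """Replace every suspect character with a space, as was done by hand in Notepad++."""
    return "".join(" " if is_suspect(ch) else ch for ch in text)


if __name__ == "__main__":
    # Hypothetical sample; U+0080 is the control character named PAD.
    sample = "Sword of\x80Doom\n"
    for ch, count in find_suspect_chars(sample).items():
        print(f"U+{ord(ch):04X} x{count}")
    print(repr(clean_text(sample)))
```

To apply it to the real file, read text.txt with the encoding it was actually saved in, pass the contents through `clean_text`, and write the result back before running text2image again. Visible characters that the font simply lacks (a real bullet glyph, for example) are not touched by this sketch and would need to be handled separately.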
