Questions tagged [text2image]


0 votes
1 answer
903 views

ocr - Tesseract training - text2image returns a segmentation fault every time on Ubuntu

I am trying to follow the official tutorial to train a new language, but I cannot get through the step in "Generate Training Images and Box Files / Prepare a Text File". I have created my text file, but every time I run the command

text2image --text=training_text.txt --outputbase=eng.TimesNewRomanBold.exp0 --font='Times New Roman Bold' --fonts_dir=/usr/share/fonts

the result is:

Could not find font named Times New Roman Bold. Pango suggested font FreeSerif Bold Please correct --font arg.:Error:Assert failed:in file text2image.cpp, line 437 Segmentation fault (core dumped)

This happens with the given example (I used the same one as the tutorial) and with every font I picked from the list shown by running

text2image --text=training_text.txt --outputbase=eng --fonts_dir=/usr/share/fonts --find_fonts --min_coverage=1.0 --render_per_font=false

Can anyone help me with this? Because of this error I cannot get any further in the tutorial.

Thanks!

0 votes
0 answers
896 views

fonts - Listing all system fonts with Tesseract OCR text2image

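I am using Tesseract OCR on Windows 10. So far I have been able to create .box and .tif files for one font at a time, but when I try to build a font list as described on the GitHub site, it does not work and gives me this warning: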

WARNING: Could not find a font to render image title with!

It also reports a failure for every font, for example:

Font Aldhabi failed with 62 hits = 21.60%

It also reports that '%' (U+25) is not covered by the font, but I don't know what that means. Anyway, the command I used is:

text2image --text=training_text.txt --outputbase=eng.fontlist .txt --fonts_dir=C:\Windows\Fonts --find_fonts --min_coverage=1.0 --render_per_font=false --fontconfig_tmpdir=C:\Tesseract\Tesseract-OCR

Any idea how to fix this error?

Thanks in advance
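
On the "'%' (U+25)" part of the message: as far as I can tell, U+25 is just the Unicode code point of the character, written in hexadecimal, so the message appears to say that the font has no glyph for the percent sign. Below is a minimal Python sketch that lists the distinct characters of a text in that notation, so they can be compared against what a font is supposed to cover. The function name describe_chars and the sample string are made up for illustration; they are not part of Tesseract.

```python
def describe_chars(text):
    """Return (code point label, character) pairs for the distinct
    non-whitespace characters in text, sorted by code point."""
    distinct = sorted({ch for ch in text if not ch.isspace()})
    # Hexadecimal code point, formatted like the U+25 in the message above.
    return [(f"U+{ord(ch):X}", ch) for ch in distinct]


# Hypothetical sample; in practice you would read training_text.txt instead.
sample = "50% off"
for label, ch in describe_chars(sample):
    print(label, ch)
```

For a real run, the sample string could be replaced by the contents of the training text file, opened with the encoding the file was actually saved in.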

0 votes
1 answer
12 views

tesseract - Error using text2image Font Exocet Light failed with 223518 hits = 99.94% when trying to build image file using Diablo 2 font

I am running Tesseract on Windows 11 from the command prompt.

The text file is my training data: words that I want to turn into images. The output is the input for the next step in the Tesseract process for training my font. I am passing --find_fonts, but I only have one font in the folder.

text2image --text="C:\PythonProjects\DiabloTesseractTrainFont\text.txt" --outputbase="C:\PythonProjects\DiabloTesseractTrainFont\Output\Dia.font.exp0" --fontconfig_tmpdir="C:\PythonProjects\DiabloTesseractTrainFont" --find_fonts --fonts_dir="C:\PythonProjects\DiabloTesseractTrainFont\Diablo Fonts"

The result: Total chars = 223645 Font Exocet Light failed with 223518 hits = 99.94%

Not sure why it fails. I have built something similar to this before. I have tried with a font file that I know has worked and it does the exact same thing.

Any help would be appreciated.
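
For reference, the numbers in that message can be checked by hand. The sketch below assumes that the percentage is simply hits divided by total chars, and that the threshold in effect is 1.0 (which I believe is the default for --min_coverage when --find_fonts is used; treat that as an assumption and check it against your text2image version). The variable names are only illustrative.

```python
def coverage(hits, total_chars):
    """Fraction of characters in the training text that the font covered."""
    return hits / total_chars


# Values copied from the text2image output quoted above.
total_chars = 223645
hits = 223518

# Assumption: this is the threshold in effect when --min_coverage is not passed.
assumed_min_coverage = 1.0

ratio = coverage(hits, total_chars)
missing = total_chars - hits
passes = ratio >= assumed_min_coverage

print(f"coverage = {ratio:.2%}")
print(f"characters not covered = {missing}")
print(f"meets threshold {assumed_min_coverage}: {passes}")
```

Under those assumptions, 127 character occurrences in the text file were not covered by the font, which would be enough for it to be reported as failed even at 99.94%.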