ocr - Tesseract cannot open .tr file

Question

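I am new to Tesseract OCR and am following this tutorial to create training data for a new language, but I get the error shown below. I have been trying and searching for several days to get past it and have not found a solution.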

I am using Windows, Tesseract v5.0.0 (the latest version), and jTessBoxEditor.

Here is my command prompt session:

F:\My R&D Softwares\tutorial>tesseract newfn.fn1.exp0.tif newfn.fn1.exp0 nobatch box.train
Tesseract Open Source OCR Engine v5.0.0-alpha.20210506 with Leptonica
Page 1
APPLY_BOXES:
   Boxes read from boxfile:       4
   Found 4 good blobs.
Generated training data for 1 words
Page 2
APPLY_BOXES:
   Boxes read from boxfile:       4
   Found 4 good blobs.
Generated training data for 1 words
Page 3
APPLY_BOXES:
   Boxes read from boxfile:       4
   Found 4 good blobs.
Generated training data for 1 words

F:\My R&D Softwares\tutorial>unicharset_extractor newfn.fn1.exp0.box
Extracting unicharset from box file newfn.fn1.exp0.box
Wrote unicharset file unicharset

F:\My R&D Softwares\tutorial>shapeclustering -F font_properties.txt -U unicharset -O newfn.unicharset newfn.fn1.exp0.tr
Reading shapeclustering ...
Failed to open tr file: shapeclustering
Reading newfn.fn1.exp0.tr ...

F:\My R&D Softwares\tutorial>mftraining -F font_properties.txt -U unicharset -O newfn.unicharset newfn.fn1.exp0.tr echo Clustering..
Warning: No shape table file present: shapetable
Reading mftraining ...
Failed to open tr file: mftraining
Reading newfn.fn1.exp0.tr ...
Reading echo ...
Failed to open tr file: echo
Reading Clustering.. ...
Failed to open tr file: Clustering..
Flat shape table summary: Number of shapes = 9 max unichars = 1 number with multiple unichars = 0

F:\My R&D Softwares\tutorial>cntraining newfn.fn1.exp0.tr
Reading newfn.fn1.exp0.tr ...
Clustering ...

F:\My R&D Softwares\tutorial>
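
For reference, the steps in the session above can be written as one argument list per tool. This is only a sketch under assumptions: the helper names training_commands and run_all are made up for this example, the flags are copied from the session and nothing else, and the tools are assumed to be on PATH. Building each command as its own list keeps stray words such as "echo Clustering.." from being appended to the mftraining arguments, which is what the log shows happened. It does not change how the tools treat their own program name.

```python
import subprocess


def training_commands(lang, font, exp=0):
    """Build the argument lists for the steps shown in the session.

    Hypothetical helper: only flags that appear in the session are used.
    """
    base = f"{lang}.{font}.exp{exp}"   # e.g. newfn.fn1.exp0
    tr_file = base + ".tr"
    shared = ["-F", "font_properties.txt", "-U", "unicharset",
              "-O", f"{lang}.unicharset", tr_file]
    return [
        ["tesseract", base + ".tif", base, "nobatch", "box.train"],
        ["unicharset_extractor", base + ".box"],
        ["shapeclustering", *shared],
        ["mftraining", *shared],
        ["cntraining", tr_file],
    ]


def run_all(commands, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first tool that exits with an error
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all(training_commands("newfn", "fn1"))
```

With dry_run left at True the script only prints the five command lines, so it can be compared against the session before anything is executed.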

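With this, I cannot create the normproto, inttemp, and pffmtable files at all. What is this "Failed to open tr file:" error, and how can I fix it? (Note that the .tr file is in the same location shown in the cmd session.)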
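
To see exactly which names are being rejected, the mftraining output can be filtered with a short script. This is a minimal sketch: the LOG text is copied from the session above, and the helper names failed_tr_names and read_names are hypothetical. In this excerpt the rejected names are the program name and the trailing words "echo" and "Clustering..", while newfn.fn1.exp0.tr is read without a failure message.

```python
import re

# Excerpt of the mftraining output quoted in the session above.
LOG = """\
Warning: No shape table file present: shapetable
Reading mftraining ...
Failed to open tr file: mftraining
Reading newfn.fn1.exp0.tr ...
Reading echo ...
Failed to open tr file: echo
Reading Clustering.. ...
Failed to open tr file: Clustering..
"""


def failed_tr_names(log):
    """Return every name reported after 'Failed to open tr file:'."""
    return re.findall(r"^Failed to open tr file: (.+)$", log, flags=re.MULTILINE)


def read_names(log):
    """Return every name the tool tried to read ('Reading X ...')."""
    return re.findall(r"^Reading (.+) \.\.\.$", log, flags=re.MULTILINE)


if __name__ == "__main__":
    failed = failed_tr_names(LOG)
    read_ok = [name for name in read_names(LOG) if name not in failed]
    print("failed:", failed)
    # failed: ['mftraining', 'echo', 'Clustering..']
    print("read without error:", read_ok)
    # read without error: ['newfn.fn1.exp0.tr']
```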

Here is the first line of the created tr file: newfn.fn1.exp0.tr
