Questions tagged [python-tesseract]



1353 questions

0 votes

0 answers

708 views

c++ - Finding the bounding box of the glyph in tesseract

I was going through the C++ API section of Tesseract and found this snippet of code for getting each symbol from a text.

Now, if we give the wrong dimensions for this bounding box (api->SetRectangle), it doesn't return the required text. Is there a way to estimate these dimensions in Tesseract? Link to the source: https://code.google.com/p/tesseract-ocr/wiki/APIExample#Example_of_iterator_over_the_classifier_choices_for_a_single_sym
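For readers arriving from the Python side of this tag: one way to avoid guessing a rectangle is to let Tesseract report a box for every recognized character over the whole image. The following is only a sketch, assuming pytesseract and Pillow are installed and that image_to_boxes returns the usual box-file text (one "char left bottom right top page" entry per line, with the origin at the bottom-left). GlyphBox, parse_boxes and glyph_boxes are hypothetical names, not part of any library.

```python
from typing import List, NamedTuple


class GlyphBox(NamedTuple):
    char: str
    left: int
    bottom: int
    right: int
    top: int


def parse_boxes(box_text: str) -> List[GlyphBox]:
    """Parse box-file text: one 'char left bottom right top page' entry per line."""
    boxes = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        left, bottom, right, top = (int(p) for p in parts[1:5])
        boxes.append(GlyphBox(parts[0], left, bottom, right, top))
    return boxes


def glyph_boxes(image_path: str) -> List[GlyphBox]:
    # Imported lazily so parse_boxes works without the OCR stack installed.
    import pytesseract
    from PIL import Image

    return parse_boxes(pytesseract.image_to_boxes(Image.open(image_path)))
```

The boxes come back in image coordinates, so they could in principle be used to choose a tighter region for a second pass.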

c++ ocr tesseract python-tesseract

2015-11-13T06:56:35.473

0 votes

1 answer

4870 views

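python - Problem installing the tesseract-ocr package - "Compilation failed with error code 1 in /tmp/pip_build_root/tesseract-ocr"

I'm trying to install the tesseract-ocr package for use with pytesseract and have run into a strange problem. Installing everything else with pip works, but when I run sudo pip install tesseract-ocr as instructed here, I get the following error:

It looks like the traceback leads to a UnicodeDecodeError. Does anyone have any idea how to fix this?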

python python-tesseract

2015-11-21T23:29:17.957

0 votes

0 answers

1020 views

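python - pytesseract OCR error

I am using pytesseract to get text from an image, but I am getting this error.

Here is my source file, along with the image directory.

I get this error

How can I get rid of this error?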

python python-2.7 tesseract python-tesseract

2015-11-27T11:15:06.590

0 votes

1 answer

28961 views

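python - pytesseract cannot find the specified file

My code is simple, as shown below:

The error response I get is:

Any guidance would be great.

Adding tesseract to my PATH variable helped: C:\Program Files (x86)\Tesseract-OCR

But now the code crashes when trying to run the pytesseract snippet.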
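pytesseract shells out to the tesseract executable, so a "file not found" error on Windows often means the binary itself cannot be located. As a rough sketch (not the asker's code): look on PATH first, then fall back to explicit candidates such as the install directory quoted above. find_tesseract and configure_pytesseract are hypothetical helpers. pytesseract.pytesseract.tesseract_cmd is the module attribute pytesseract reads for the executable path. The tesseract.exe file name under that directory is an assumption.

```python
import os
import shutil
from typing import Iterable, Optional


def find_tesseract(extra_candidates: Iterable[str] = ()) -> Optional[str]:
    """Return a path to the tesseract executable, or None if none is found."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    for candidate in extra_candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def configure_pytesseract() -> str:
    # Assumed location, based on the directory mentioned in the question.
    cmd = find_tesseract([r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"])
    if cmd is None:
        raise FileNotFoundError("tesseract executable not found; install the engine first")
    import pytesseract  # imported lazily; only needed once a binary is found

    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Calling configure_pytesseract() once before the first image_to_string call points pytesseract at the binary without relying on PATH.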

python tesseract python-tesseract

2015-12-11T14:34:51.527

0 votes

3 answers

12905 views

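tesseract - How to get hOCR output using python-tesseract

I get very good results with pytesseract, but it does not preserve double spaces, and those really matter to me. So I decided to retrieve hOCR output instead of plain text. However, there seems to be no way to specify a config file with pytesseract.

So, is it possible to specify a config file with pytesseract, or is there some default config file I can change to get hOCR output?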
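For what it's worth, recent pytesseract releases expose image_to_pdf_or_hocr, which may not have existed when this was asked. A minimal sketch under that assumption: fetch the hOCR and pull out each word with its bounding box, so that gaps between words can be inferred from the coordinates. hocr_words and image_to_hocr are hypothetical helper names. The regex assumes Tesseract's usual attribute order (class before title) and is not a general HTML parser.

```python
import html
import re
from typing import List, Tuple

BBox = Tuple[int, int, int, int]

# Assumes class=... appears before title=... inside the span, as Tesseract emits it.
WORD_RE = re.compile(
    r"<span[^>]*class=['\"]ocrx_word['\"][^>]*"
    r"title=['\"]bbox (\d+) (\d+) (\d+) (\d+)[^'\"]*['\"][^>]*>(.*?)</span>",
    re.DOTALL,
)
TAG_RE = re.compile(r"<[^>]+>")


def hocr_words(hocr: str) -> List[Tuple[str, BBox]]:
    """Extract (text, (x0, y0, x1, y1)) for every ocrx_word span."""
    words = []
    for m in WORD_RE.finditer(hocr):
        x0, y0, x1, y1 = (int(g) for g in m.groups()[:4])
        text = html.unescape(TAG_RE.sub("", m.group(5))).strip()
        words.append((text, (x0, y0, x1, y1)))
    return words


def image_to_hocr(image_path: str) -> str:
    # Imported lazily so hocr_words can be used on saved hOCR files as well.
    import pytesseract
    from PIL import Image

    raw = pytesseract.image_to_pdf_or_hocr(Image.open(image_path), extension="hocr")
    return raw.decode("utf-8")
```

With the boxes in hand, a wide horizontal gap between one word's x1 and the next word's x0 can be treated as a double space. The threshold is something to tune per document.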

tesseract python-tesseract hocr

2015-12-13T06:10:32.390

0 votes

2 answers

11648 views

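python - UnicodeDecodeError with Tesseract OCR in Python

I am trying to extract text from an image file using Tesseract OCR in Python, but I am facing an error that I can't figure out how to handle. My environment is fine, since I have tested OCR on some sample images in Python!

Here is the code

Below is the error I get from the Eclipse console

I am using Python 3.5 x64 on Windows 10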

python tesseract python-tesseract

2015-12-15T15:37:39.383

0 votes

1 answer

522 views

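php - Python / PHP Tesseract output optimization tips

I have a Python script that scans a receipt and writes the result to a scanned file. On the new file I run tesseract imagefile outputfile. I get good, readable text, but the parsed output looks like the following. Is there a way, using tesseract, to line up the purchased items next to their prices? My preference is to do this in PHP or Python.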
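One possible post-processing approach in Python, offered only as a sketch: if each item and its price end up on the same text line (which may require a page segmentation mode that keeps rows together; that is an assumption about the receipt layout), a regular expression can pair them. LINE_RE and pair_items_with_prices are hypothetical names, and the price pattern assumes exactly two decimal places.

```python
import re
from decimal import Decimal
from typing import List, Tuple

# Item text, whitespace, optional "$", then a price with two decimals at line end.
LINE_RE = re.compile(r"^(?P<item>.+?)\s+\$?(?P<price>\d+[.,]\d{2})\s*$")


def pair_items_with_prices(ocr_text: str) -> List[Tuple[str, Decimal]]:
    """Return (item, price) for every line that ends in something price-like."""
    pairs = []
    for line in ocr_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            # Accept a decimal comma as well as a decimal point.
            price = Decimal(m.group("price").replace(",", "."))
            pairs.append((m.group("item").strip(), price))
    return pairs
```

Lines without a trailing price (store name, greetings) are simply skipped. Totals and tax lines still match, so they would need filtering by keyword.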

php python tesseract python-tesseract

2015-12-19T19:32:30.880

0 votes

0 answers

303 views

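python - Tesseract OCR on Windows Python

I installed the Tesseract OCR engine and the pytesseract client library so that I can read strings contained in images. No matter which way I slice it, I get the following error. Any ideas?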

python tesseract python-tesseract

2016-01-09T19:23:07.047

0 votes

1 answer

387 views

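python - Numeric character recognition in Pytesser

I am working on a project that requires me to fetch prices from a commodity exchange. Unfortunately, the exchange has no web service or other plug-in that would let me get prices from the trading screen.

I figured I could automatically take screenshots of the prices and split them all into separate images. I then process them using the Pytesser V 0.0.1 library for Tesseract 3.0.2 together with Pillow 3.1.0 in Python v2.7. However, the image-to-text conversion (via the image_to_string function) is dramatically bad: in most cases 0 becomes o or 5 becomes s, and sometimes the conversion is random, which makes simply replacing those characters difficult. I have already resized the images to a larger size and used anti-aliasing, but the results did not improve. Is there a way to restrict the character set to digits and the decimal point only? And how can I improve the conversion quality?

Maybe my approach is too tedious; do you know a better way? Thanks for your help :)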
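On restricting the character set: Tesseract has a tessedit_char_whitelist variable, and pytesseract (used here instead of Pytesser, which is a different wrapper) accepts extra command-line options through its config argument. The sketch below assumes pytesseract and Pillow are installed and that each image holds a single line of text. Older Tesseract 3.x builds spell the segmentation flag -psm rather than --psm. DIGIT_CONFIG, normalize_price and read_price are hypothetical names, and the look-alike table is only a fallback for when the whitelist is not honoured.

```python
# Single text line, digits and decimal point only (flag spelling varies by version).
DIGIT_CONFIG = "--psm 7 -c tessedit_char_whitelist=0123456789."

# Common letter-for-digit confusions, applied as a fallback clean-up step.
CONFUSIONS = str.maketrans({"o": "0", "O": "0", "s": "5", "S": "5", "l": "1", "I": "1"})


def normalize_price(raw: str) -> str:
    """Map look-alike letters to digits and drop everything but digits and '.'."""
    cleaned = raw.strip().translate(CONFUSIONS)
    return "".join(ch for ch in cleaned if ch in "0123456789.")


def read_price(image_path: str) -> str:
    # Imported lazily so normalize_price can be tested without the OCR stack.
    import pytesseract
    from PIL import Image

    raw = pytesseract.image_to_string(Image.open(image_path), config=DIGIT_CONFIG)
    return normalize_price(raw)
```

Because the question notes that some misreads are random, the whitelist is the main lever here. The translation table only catches the predictable substitutions.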

python python-imaging-library tesseract python-tesseract pytesser

2016-01-22T14:49:20.337

0 votes

4 answers

8704 views

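python - gcc error when installing tesseract-ocr

I am trying to run the following code on my Mac.

Following the question here: pytesseract-no such file or directory error, I need to install tesseract-ocr.

But when I try pip install tesseract-ocr, I get the following error:

I don't know what to do.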

python tesseract python-tesseract pytesser

2016-01-24T22:30:16.210


