python - How can I use OCR on images in Plone 4.1.4 with collective.documentviewer?

Question

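I have a Plone (v4.1.4) site, installed with the standalone Unified Installer, running collective.documentviewer 2.2.1. Searching for words works fine in scanned documents, XLS, OpenOffice, RTF and PDF files. However, if an image that contains text is uploaded as the Image content type, the document viewer does not handle it, even when OCR is checked in the document viewer settings. If the image is uploaded as a File instead, I still cannot search for words that are part of the image, even after setting the appropriate image formats (gif, png, jpg). I have installed the necessary tesseract packages on my Linux system, as shown by the following command: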

dpkg -l| grep tesseract
ii  libtesseract3                        3.02.01-6                        i386         Command line OCR tool
ii  tesseract-ocr                        3.02.01-6                        i386         Command line OCR tool
ii  tesseract-ocr-eng                    3.02-2                           all          tesseract-ocr language files for English
ii  tesseract-ocr-equ                    3.02-2                           all          tesseract-ocr language files for equations
ii  tesseract-ocr-osd                    3.02-2                           all          tesseract-ocr language files for script and orientation

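A sample GIF image is attached. For example, I want to search for the word "Laboratory" in the image. The Text tab does not show the words from the image embedded in this PDF. Please advise.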
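One way to narrow the problem down is to check whether tesseract alone can read the image, independently of Plone and collective.documentviewer. The following is a minimal sketch under stated assumptions, not part of collective.documentviewer: the helper names `build_tesseract_command` and `ocr_contains` and the file name `sample.gif` are hypothetical, the script is meant to be run with a standalone Python 3 interpreter rather than inside the Plone instance, and whether tesseract 3.02 can open GIF files directly depends on how it was built.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argv list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_contains(image_path, word, lang="eng"):
    """Run tesseract on the image and report whether `word` is in its output.

    Returns None when the tesseract binary is not on PATH.
    Raises subprocess.CalledProcessError if tesseract fails on the image.
    """
    if shutil.which("tesseract") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        # tesseract appends ".txt" to the output base name it is given.
        output_base = Path(tmp) / "ocr_result"
        subprocess.run(
            build_tesseract_command(image_path, output_base, lang),
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        text = output_base.with_suffix(".txt").read_text(
            encoding="utf-8", errors="replace"
        )
    # Case-insensitive match, since OCR output may differ in capitalisation.
    return word.lower() in text.lower()


if __name__ == "__main__":
    # "sample.gif" is a placeholder for the uploaded image.
    if Path("sample.gif").exists():
        print(ocr_contains("sample.gif", "Laboratory"))
```

If this check finds the word, tesseract itself can read the image and the issue more likely lies in how the image reaches the document viewer. If it raises an error or does not find the word, converting the image to another format such as PNG or TIFF before running tesseract may be worth trying.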
