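I downloaded pdfminer, and the command-line tool works fine, but I want to convert several PDF documents at once, so I tried using pdfminer as a library. I found this code on Stack Overflow, but I can't get it to work.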

from pdfminer.pdfinterp import PDFResourceManager, process_pdf
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from cStringIO import StringIO



def convert_pdf(path):

    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)

    fp = file(path, 'rb')
    process_pdf(rsrcmgr, device, fp)
    fp.close()
    device.close()

    str = retstr.getvalue()
    retstr.close()
    print str


convert_pdf("/Users/gorkemyurtseven/Desktop/casino.pdf")

When I run it, I get:

Traceback (most recent call last):
  File "pdfminer.py", line 1, in <module>
    from pdfminer.pdfinterp import PDFResourceManager, process_pdf
  File "/Users/gorkemyurtseven/Desktop/pdfminer.py", line 1, in <module>
    from pdfminer.pdfinterp import PDFResourceManager, process_pdf
ImportError: No module named pdfinterp

2 Answers

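It seems you named your script pdfminer.py, the same name as the module. Python puts the script's own directory first on its import path, so "from pdfminer.pdfinterp import ..." resolves to your script instead of the installed package. The traceback shows this: the import on line 1 loads /Users/gorkemyurtseven/Desktop/pdfminer.py again, and that file has no pdfinterp submodule.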

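Another possible cause is that the pdfminer module is not installed correctly, or that the installed version does not match your Python distribution.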
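
A quick way to tell the two causes apart is to ask Python which file it would actually load for a given module name. The sketch below is only an illustration and assumes Python 3 (the question's code is Python 2); the helper names find_shadowing_files and describe_import_origin are made up for this example and are not part of pdfminer.

```python
import importlib.util
import os


def find_shadowing_files(module_name, directory):
    """Return files in `directory` that would shadow `module_name` on import.

    Hypothetical helper: it only looks for <module_name>.py and
    <module_name>.pyc directly inside `directory`.
    """
    candidates = [module_name + ".py", module_name + ".pyc"]
    return [
        os.path.join(directory, name)
        for name in candidates
        if os.path.isfile(os.path.join(directory, name))
    ]


def describe_import_origin(module_name):
    """Return the file Python would load for `module_name`, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # A non-empty list here means a local file is hiding the real package.
    print(find_shadowing_files("pdfminer", os.getcwd()))
    # A path outside site-packages (or None) points at the same kind of problem.
    print(describe_import_origin("pdfminer"))
```

If the first call lists a pdfminer.py next to your script, the name clash is the culprit. If it returns an empty list and the second call prints None, the package is probably not installed for that interpreter.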

answered 2013-11-07T09:56:33.610

As described in this article, the problem is that your file is named pdfminer.py.
Rename it, and delete the generated __pycache__/ directory and the pdfminer.pyc file:

$ rm -r __pycache__/ pdfminer.pyc
$ mv pdfminer.py mypdfminer.py
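
Once the script no longer clashes with the package, converting several PDFs in one run could look roughly like the sketch below. It is not the asker's code: it assumes Python 3 with the pdfminer.six fork installed, whose pdfminer.high_level module provides extract_text. The function name convert_pdfs and its extract parameter are hypothetical, added so the loop can be tried with a stand-in extractor.

```python
def convert_pdfs(paths, extract=None):
    """Return a dict mapping each PDF path (as a string) to its extracted text.

    `extract` is a hypothetical hook: any callable that takes a path and
    returns text. When omitted, pdfminer.six's extract_text is used.
    """
    if extract is None:
        # Assumption: pdfminer.six is installed. The import is done lazily
        # so that a missing package only fails when it is actually needed.
        from pdfminer.high_level import extract_text as extract
    return {str(path): extract(str(path)) for path in paths}
```

For example, convert_pdfs(["casino.pdf", "other.pdf"]) would return the text of both files keyed by path, where other.pdf is a placeholder file name.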
answered 2018-10-04T17:15:08.990