I followed the instructions here: file:///home/bioinfo/Descargas/pdfminer3k-1.3.0/docs/index.html

After downloading pdfminer3k-1.3.0, I ran:

python setup.py install

But when I then run

pdf2txt.py samples/simple1.pdf

it does not read the PDF, even though the path is correct. It gives me this error:

Traceback (most recent call last):
  File "/usr/local/bin/pdf2txt.py", line 5, in <module>
    pkg_resources.run_script('pdfminer3k==1.3.0', 'pdf2txt.py')
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 528, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1394, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/pdfminer3k-1.3.0-py2.7.egg/EGG-INFO/scripts/pdf2txt.py", line 6, in <module>
    from pdfminer.pdfinterp import PDFResourceManager, process_pdf
  File "/usr/local/lib/python2.7/dist-packages/pdfminer3k-1.3.0-py2.7.egg/pdfminer/pdfinterp.py", line 5, in <module>
    from .cmapdb import CMapDB, CMap
  File "/usr/local/lib/python2.7/dist-packages/pdfminer3k-1.3.0-py2.7.egg/pdfminer/cmapdb.py", line 23, in <module>
    from .psparser import PSStackParser
  File "/usr/local/lib/python2.7/dist-packages/pdfminer3k-1.3.0-py2.7.egg/pdfminer/psparser.py", line 4, in <module>
    from .utils import choplist
  File "/usr/local/lib/python2.7/dist-packages/pdfminer3k-1.3.0-py2.7.egg/pdfminer/utils.py", line 212, in <module>
    0x00f8, 0x00f9, 0x00fa, 0x00fb, 0x00fc, 0x00fd, 0x00fe, 0x00ff,
  File "/usr/local/lib/python2.7/dist-packages/pdfminer3k-1.3.0-py2.7.egg/pdfminer/utils.py", line 180, in <genexpr>
    PDFDocEncoding = ''.join( chr(x) for x in (
ValueError: chr() arg not in range(256)

Is there any way to fix this?

1 Answer

The latest code (version 20140328) uses unichr(). Try this:

pip install pdfminer==20140328

Or download it from https://pypi.python.org/pypi/pdfminer/20140328.
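
For context on why unichr() matters: the traceback shows the script running under Python 2.7, where chr() only accepts values from 0 to 255, while the PDFDocEncoding table being built in utils.py includes larger code points. Below is a minimal sketch of the difference. The name `to_char` and the sample code points are illustrative assumptions, not taken from pdfminer's source.

```python
import sys

# Python 2: chr() accepts only 0..255 and raises ValueError beyond that;
#           unichr() handles the wider Unicode range.
# Python 3: chr() already returns Unicode text and unichr() does not exist.
if sys.version_info[0] >= 3:
    to_char = chr
else:
    to_char = unichr  # noqa: F821 (defined on Python 2 only)

# Illustrative code points; 0x2022 is above 255, which is the kind of value
# that makes Python 2's chr() fail with "chr() arg not in range(256)".
code_points = [0x0041, 0x00ff, 0x2022]

text = u''.join(to_char(x) for x in code_points)
print(repr(text))
```

Since pdfminer3k calls chr() on such values, it appears to be written for Python 3 semantics, so running it with a Python 3 interpreter may be another option besides switching to the pdfminer package.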

Answered 2015-03-08T17:49:02.580