When trying to extract numbers from a PDF using PyPDF2, I get:
KeyError: '/Contents'. Here is the code:
import PyPDF2 as pdf
fhand = open('filepdf.pdf', 'rb')
reader = pdf.PdfFileReader(fhand)
if reader.isEncrypted == True:
    pass
else:
    for i in range(reader.getNumPages()):
        for word in reader.getPage(i).extractText().split():
            if word.isdigit():
                print(word)
The code works for other PDF files. Here is the traceback:
Traceback (most recent call last):
  File "C:\Users\Root\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 193, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Root\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "c:\Users\Root\.vscode\extensions\ms-python.python-2020.4.76186\pythonFiles\lib\python\debugpy\no_wheels\debugpy\__main__.py", line 45, in <module>
    cli.main()
  File "c:\Users\Root\.vscode\extensions\ms-python.python-2020.4.76186\pythonFiles\lib\python\debugpy\no_wheels\debugpy/..\debugpy\server\cli.py", line 430, in main
    run()
  File "c:\Users\Root\.vscode\extensions\ms-python.python-2020.4.76186\pythonFiles\lib\python\debugpy\no_wheels\debugpy/..\debugpy\server\cli.py", line 267, in run_file
    runpy.run_path(options.target, run_name=compat.force_str("__main__"))
  File "C:\Users\Root\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 263, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\Root\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\Root\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "c:\Users\Root\Desktop\test\test.py", line 9, in <module>
    for word in reader.getPage(i).extractText().split():
  File "C:\Users\Root\AppData\Local\Programs\Python\Python38-32\lib\site-packages\PyPDF2\pdf.py", line 2593, in extractText
    content = self["/Contents"].getObject()
  File "C:\Users\Root\AppData\Local\Programs\Python\Python38-32\lib\site-packages\PyPDF2\generic.py", line 516, in __getitem__
    return dict.__getitem__(self, key).getObject()
KeyError: '/Contents'
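
Reading the traceback, the failure happens inside extractText when it looks up "/Contents" in the page dictionary, which suggests that at least one page in this particular file has no content stream (a blank page, for example). That is only a guess from the traceback, not something confirmed for this file. A minimal sketch of a guard under that assumption follows; digits_on_page is a hypothetical helper name, not part of PyPDF2, and it only relies on the page object being dict-like and having an extractText() method, as the traceback indicates.

```python
def digits_on_page(page):
    """Return the all-digit words on one page, or [] if it has no content stream.

    `page` is assumed to be a dict-like PyPDF2 page object, such as the one
    returned by reader.getPage(i). This helper is hypothetical, not a PyPDF2 API.
    """
    # Assumption: the KeyError is raised for pages without a "/Contents" entry,
    # so such pages are skipped instead of being passed to extractText().
    if "/Contents" not in page:
        return []
    return [word for word in page.extractText().split() if word.isdigit()]


# Possible use inside the original loop (reader as defined in the question):
# for i in range(reader.getNumPages()):
#     for word in digits_on_page(reader.getPage(i)):
#         print(word)
```

If the error persists with this guard, the assumption about a missing content stream is probably wrong and the file would need a closer look.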