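I've built a working Python script with pikepdf and pdfminer3 that takes a PDF from my desktop and writes its extractable text to a .txt file.

The goal is to help my team at work revise legal documents, which usually can't be copied and pasted for editing (so they have to be retyped by hand). Since most of my colleagues are reluctant to set up Anaconda and use Python, I want to turn the script into an .exe with pyinstaller.

When I run the application that pyinstaller creates, I get through the initial prompts before hitting this error: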

    Traceback <most recent call last>:
      File 'PDF2TEXT.py', line 35, in <module>
    ModuleNotFoundError: No module named 'pikepdf._cpphelpers'
    (10688) Failed to execute script PDF2TEXT

While pyinstaller is building, I also get a long series of warnings about missing Anaconda3 DLL files, for example:

    WARNING: lib not found: msmpi.dll dependency of c:\users\anejar1\appdata\local\continuum\anaconda3\Library\bin\mkl_blacs_msmpi_ilp64.dll

I did some digging and tried solutions from other threads without success, including running:

    pyinstaller --path=[path to pikepdf] --path=[path to pdfminer3] -F PDF2TEXT.py

    pyinstaller --hidden-import=pikepdf --hidden-import=pdfminer3 -F PDF2TEXT.py

The underlying code is short (and works fine when run directly), importing only from pikepdf, pdfminer3, and os:

```python
from pdfminer3.layout import LAParams, LTTextBox, LTTextLine
from pdfminer3.pdfpage import PDFPage, PDFTextExtractionNotAllowed
from pdfminer3.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer3.converter import PDFPageAggregator
from pdfminer3.pdfparser import PDFParser
from pdfminer3.pdfdocument import PDFDocument
import os
import pikepdf

print('Enter your CBA User ID:')
CBAusername = input()
print('\nEnter the name of your PDF:')
FileName = input()

# Build the input, intermediate, and output paths.
base_path = "C:/Users/" + CBAusername + "/Desktop"
pike_path = "C:/Users/" + CBAusername + "/AppData/Roaming/Python/Python36/PikePDF"
my_file = os.path.join(base_path, FileName + ".pdf")
my_extractable_file = os.path.join(pike_path, "PikedPDF2TEXT.pdf")
log_file = os.path.join(base_path, "PDF2TEXT.txt")

password = ""
extract = ""

# Re-save the PDF with pikepdf so that pdfminer3 can read it.
pdf = pikepdf.open(my_file)
pdf.save(my_extractable_file)

# Parse the re-saved PDF and collect the text from every page.
with open(my_extractable_file, "rb") as fp:
    parser = PDFParser(fp)
    document = PDFDocument(parser, password)

    if not document.is_extractable:
        raise PDFTextExtractionNotAllowed

    rsrcmgr = PDFResourceManager()
    laparams = LAParams()
    device = PDFPageAggregator(rsrcmgr, laparams=laparams)
    interpreter = PDFPageInterpreter(rsrcmgr, device)

    for page in PDFPage.create_pages(document):
        interpreter.process_page(page)
        layout = device.get_result()
        for lt_obj in layout:
            if isinstance(lt_obj, (LTTextBox, LTTextLine)):
                extract += lt_obj.get_text()

print(extract.encode("utf-8"))

# Write the extracted text to PDF2TEXT.txt on the desktop.
with open(log_file, "wb") as my_log:
    my_log.write(extract.encode("utf-8"))
    print("Done !!")
```

In case it is of any further use, here is the full log from the pyinstaller run:

```
(C:\Users\anejar1\AppData\Local\Continuum\anaconda3) C:\Users\anejar1>cd C:\User
s\anejar1\AppData\Roaming\Python\Python36\Scripts

(C:\Users\anejar1\AppData\Local\Continuum\anaconda3) C:\Users\anejar1\AppData\Ro
aming\Python\Python36\Scripts>pyinstaller --hidden-import=pdfminer3 --hiddenimpo
rt=pikepdf -F PDF2TEXT.py
214 INFO: PyInstaller: 3.5
214 INFO: Python: 3.6.3
215 INFO: Platform: Windows-8.1-6.3.9600-SP0
218 INFO: wrote C:\Users\anejar1\AppData\Roaming\Python\Python36\Scripts\PDF2TEX
T.spec
221 INFO: UPX is not available.
225 INFO: Extending PYTHONPATH with paths
['C:\\Users\\anejar1\\AppData\\Roaming\\Python\\Python36\\Scripts',
 'C:\\Users\\anejar1\\AppData\\Roaming\\Python\\Python36\\Scripts']
225 INFO: checking Analysis
225 INFO: Building Analysis because Analysis-00.toc is non existent
226 INFO: Initializing module dependency graph...
235 INFO: Initializing module graph hooks...
240 INFO: Analyzing base_library.zip ...
8340 INFO: Analyzing hidden import 'pdfminer3'
8613 INFO: Analyzing hidden import 'pikepdf'
9570 INFO: Processing pre-find module path hook   distutils
10124 INFO: Processing pre-find module path hook   site
10125 INFO: site: retargeting to fake-dir 'C:\\Users\\anejar1\\AppData\\Roaming\
\Python\\Python36\\site-packages\\PyInstaller\\fake-modules'
15732 INFO: Processing pre-safe import module hook   setuptools.extern.six.moves

26995 INFO: running Analysis Analysis-00.toc
27021 INFO: Adding Microsoft.Windows.Common-Controls to dependent assemblies of
final executable
  required by c:\users\anejar1\appdata\local\continuum\anaconda3\python.exe
27608 INFO: Caching module hooks...
27623 INFO: Analyzing C:\Users\anejar1\AppData\Roaming\Python\Python36\Scripts\P
DF2TEXT.py
31216 INFO: Loading module hooks...
31217 INFO: Loading module hook "hook-Crypto.py"...
31252 INFO: Loading module hook "hook-distutils.py"...
31255 INFO: Loading module hook "hook-encodings.py"...
31453 INFO: Loading module hook "hook-lib2to3.py"...
31465 INFO: Loading module hook "hook-lxml.etree.py"...
31467 INFO: Loading module hook "hook-PIL.Image.py"...
33168 INFO: Loading module hook "hook-PIL.py"...
33172 INFO: Excluding import 'PySide'
33177 INFO:   Removing import of PySide from module PIL.ImageQt
33178 INFO: Excluding import 'tkinter'
33182 INFO:   Removing import of tkinter from module PIL.ImageTk
33183 INFO: Import to be excluded not found: 'FixTk'
33184 INFO: Excluding import 'PyQt5'
33187 INFO:   Removing import of PyQt5.QtGui from module PIL.ImageQt
33187 INFO:   Removing import of PyQt5.QtCore from module PIL.ImageQt
33190 INFO: Excluding import 'PyQt4'
33194 INFO:   Removing import of PyQt4 from module PIL.ImageQt
33196 INFO: Loading module hook "hook-PIL.SpiderImagePlugin.py"...
33199 INFO: Excluding import 'tkinter'
33201 INFO: Import to be excluded not found: 'FixTk'
33202 INFO: Loading module hook "hook-pkg_resources.py"...
35151 INFO: Processing pre-safe import module hook   win32com
35775 INFO: Loading module hook "hook-pycparser.py"...
36116 INFO: Loading module hook "hook-pydoc.py"...
36118 INFO: Loading module hook "hook-PyQt5.py"...
36389 INFO: Loading module hook "hook-PyQt5.QtCore.py"...
36390 INFO: Loading module hook "hook-PyQt5.QtGui.py"...
36393 INFO: Loading module hook "hook-pythoncom.py"...
37711 INFO: Loading module hook "hook-pywintypes.py"...
38928 INFO: Loading module hook "hook-setuptools.py"...
54087 INFO: Loading module hook "hook-sysconfig.py"...
54090 INFO: Loading module hook "hook-win32com.py"...
55853 INFO: Loading module hook "hook-xml.dom.domreg.py"...
55855 INFO: Loading module hook "hook-xml.py"...
55858 INFO: Loading module hook "hook-_tkinter.py"...
56277 INFO: checking Tree
56278 INFO: Building Tree because Tree-00.toc is non existent
56278 INFO: Building Tree Tree-00.toc
56519 INFO: checking Tree
56519 INFO: Building Tree because Tree-01.toc is non existent
56520 INFO: Building Tree Tree-01.toc
56553 INFO: Loading module hook "hook-numpy.core.py"...
56851 INFO: MKL libraries found when importing numpy. Adding MKL to binaries
56859 INFO: Loading module hook "hook-numpy.py"...
56863 INFO: Loading module hook "hook-pytest.py"...
59074 INFO: Loading module hook "hook-scipy.py"...
59077 WARNING: Hidden import "scipy._lib.messagestream" not found!
59078 WARNING: Hidden import "scipy._lib._fpumode" not found!
59162 INFO: Looking for ctypes DLLs
59242 INFO: Analyzing run-time hooks ...
59252 INFO: Including run-time hook 'pyi_rth_pkgres.py'
59255 INFO: Including run-time hook 'pyi_rth_win32comgenpy.py'
59258 INFO: Including run-time hook 'pyi_rth_multiprocessing.py'
59264 INFO: Including run-time hook 'pyi_rth_pyqt5.py'
59286 INFO: Looking for dynamic libraries
59698 WARNING: lib not found: msmpi.dll dependency of c:\users\anejar1\appdata\l
ocal\continuum\anaconda3\Library\bin\mkl_blacs_msmpi_ilp64.dll
59779 WARNING: lib not found: impi.dll dependency of c:\users\anejar1\appdata\lo
cal\continuum\anaconda3\Library\bin\mkl_blacs_intelmpi_ilp64.dll
59850 WARNING: lib not found: msmpi.dll dependency of c:\users\anejar1\appdata\l
ocal\continuum\anaconda3\Library\bin\mkl_blacs_msmpi_lp64.dll
60005 WARNING: lib not found: tbb.dll dependency of c:\users\anejar1\appdata\loc
al\continuum\anaconda3\Library\bin\mkl_tbb_thread.dll
60226 WARNING: lib not found: pgf90.dll dependency of c:\users\anejar1\appdata\l
ocal\continuum\anaconda3\Library\bin\mkl_pgi_thread.dll
60294 WARNING: lib not found: pgf90rtl.dll dependency of c:\users\anejar1\appdat
a\local\continuum\anaconda3\Library\bin\mkl_pgi_thread.dll
60355 WARNING: lib not found: pgc.dll dependency of c:\users\anejar1\appdata\loc
al\continuum\anaconda3\Library\bin\mkl_pgi_thread.dll
60621 WARNING: lib not found: mpich2mpi.dll dependency of c:\users\anejar1\appda
ta\local\continuum\anaconda3\Library\bin\mkl_blacs_mpich2_lp64.dll
61074 WARNING: lib not found: mpich2mpi.dll dependency of c:\users\anejar1\appda
ta\local\continuum\anaconda3\Library\bin\mkl_blacs_mpich2_ilp64.dll
61203 WARNING: lib not found: impi.dll dependency of c:\users\anejar1\appdata\lo
cal\continuum\anaconda3\Library\bin\mkl_blacs_intelmpi_lp64.dll
65487 INFO: Found C:\windows\WinSxS\Manifests\amd64_policy.9.0.microsoft.vc90.cr
t_1fc8b3b9a1e18e3b_9.0.30729.6161_none_acd388d7e1d8689f.manifest
65496 INFO: Found C:\windows\WinSxS\Manifests\amd64_policy.9.0.microsoft.vc90.cr
t_1fc8b3b9a1e18e3b_9.0.30729.8387_none_acd5043fe1d73003.manifest
65811 INFO: Searching for assembly amd64_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0
.30729.8387_none ...
65812 INFO: Found manifest C:\windows\WinSxS\Manifests\amd64_microsoft.vc90.crt_
1fc8b3b9a1e18e3b_9.0.30729.8387_none_08e793bfa83a89b5.manifest
65815 INFO: Searching for file msvcr90.dll
65816 INFO: Found file C:\windows\WinSxS\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e
3b_9.0.30729.8387_none_08e793bfa83a89b5\msvcr90.dll
65817 INFO: Searching for file msvcp90.dll
65818 INFO: Found file C:\windows\WinSxS\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e
3b_9.0.30729.8387_none_08e793bfa83a89b5\msvcp90.dll
65818 INFO: Searching for file msvcm90.dll
65819 INFO: Found file C:\windows\WinSxS\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e
3b_9.0.30729.8387_none_08e793bfa83a89b5\msvcm90.dll
66044 INFO: Found C:\windows\WinSxS\Manifests\amd64_policy.9.0.microsoft.vc90.cr
t_1fc8b3b9a1e18e3b_9.0.30729.6161_none_acd388d7e1d8689f.manifest
66045 INFO: Found C:\windows\WinSxS\Manifests\amd64_policy.9.0.microsoft.vc90.cr
t_1fc8b3b9a1e18e3b_9.0.30729.8387_none_acd5043fe1d73003.manifest
66047 INFO: Adding redirect Microsoft.VC90.CRT version (9, 0, 21022, 8) -> (9, 0
, 30729, 8387)
66111 WARNING: lib not found: MSVCR90.dll dependency of c:\users\anejar1\appdata
\local\continuum\anaconda3\Library\bin\zlib.dll
66813 INFO: Looking for eggs
66813 INFO: Using Python library c:\users\anejar1\appdata\local\continuum\anacon
da3\python36.dll
66814 INFO: Found binding redirects:
[BindingRedirect(name='Microsoft.VC90.CRT', language=None, arch='amd64', oldVers
ion=(9, 0, 21022, 8), newVersion=(9, 0, 30729, 8387), publicKeyToken='1fc8b3b9a1
e18e3b')]
66846 INFO: Warnings written to C:\Users\anejar1\AppData\Roaming\Python\Python36
\Scripts\build\PDF2TEXT\warn-PDF2TEXT.txt
67269 INFO: Graph cross-reference written to C:\Users\anejar1\AppData\Roaming\Py
thon\Python36\Scripts\build\PDF2TEXT\xref-PDF2TEXT.html
67347 INFO: checking PYZ
67348 INFO: Building PYZ because PYZ-00.toc is non existent
67349 INFO: Building PYZ (ZlibArchive) C:\Users\anejar1\AppData\Roaming\Python\P
ython36\Scripts\build\PDF2TEXT\PYZ-00.pyz
70044 INFO: Building PYZ (ZlibArchive) C:\Users\anejar1\AppData\Roaming\Python\P
ython36\Scripts\build\PDF2TEXT\PYZ-00.pyz completed successfully.
70092 INFO: checking PKG
70093 INFO: Building PKG because PKG-00.toc is non existent
70093 INFO: Building PKG (CArchive) PKG-00.pkg
194058 INFO: Building PKG (CArchive) PKG-00.pkg completed successfully.
194071 INFO: Bootloader C:\Users\anejar1\AppData\Roaming\Python\Python36\site-pa
ckages\PyInstaller\bootloader\Windows-64bit\run.exe
194072 INFO: checking EXE
194073 INFO: Building EXE because EXE-00.toc is non existent
194073 INFO: Building EXE from EXE-00.toc
194074 INFO: Appending archive to EXE C:\Users\anejar1\AppData\Roaming\Python\Py
thon36\Scripts\dist\PDF2TEXT.exe
195004 INFO: Building EXE from EXE-00.toc completed successfully.

(C:\Users\anejar1\AppData\Local\Continuum\anaconda3) C:\Users\anejar1\AppData\Ro
aming\Python\Python36\Scripts>
```

4 Answers

I found a fix for this error:

    No module named 'pikepdf._cpphelpers'

Just add:

    from pikepdf import _cpphelpers

at the top of your script.
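
A likely reason this works (an assumption, not something confirmed in this thread): PyInstaller decides what to bundle mainly by scanning import statements, so a module that is only loaded by name at run time never shows up in that scan. The sketch below illustrates such a by-name import, using the standard library's json.decoder as a stand-in; load_helper is a hypothetical name, not part of pikepdf or PyInstaller.

```python
import importlib


def load_helper(package: str, helper: str):
    """Import ``package.helper`` by name at run time.

    The module name is assembled from strings, so no literal
    ``import package.helper`` statement exists for a static scan to find.
    """
    return importlib.import_module(package + "." + helper)


# Stand-in example: loads json.decoder without a literal import statement.
decoder = load_helper("json", "decoder")
print(decoder.__name__)
```

An explicit statement such as the one in this answer gives the scanner a literal import to find, which is presumably why the module then ends up in the bundle.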

answered 2019-10-18T10:41:42.987

I think you need to try a pikepdf build that matches your Python version.

Please refer to the following link to install the pikepdf module

answered 2019-10-09T07:32:57.557

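I run into this problem a lot; I think pyinstaller is failing to pick up some modules.

What you need to do is find where the missing modules live, in your case pikepdf and pdfminer3.

Go to your Python directory, which should look something like this: C:\Users\\AppData\Local\Programs\Python\Python37-32\Lib\site-packages

Copy the dist folder of each missing package and paste it into the folder of your exe application.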
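
Rather than guessing the site-packages path, the location of an installed package can be looked up from Python itself. This is a minimal sketch using the standard library's importlib.util.find_spec; package_directories is a hypothetical helper name, and json is used only as a stand-in for a package that is certain to be installed.

```python
import importlib.util
from typing import List, Optional


def package_directories(name: str) -> Optional[List[str]]:
    """Return the directories that make up package ``name``.

    Returns None if the name cannot be found or refers to a plain
    module rather than a package.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.submodule_search_locations is None:
        return None
    return list(spec.submodule_search_locations)


# Stand-in example; replace "json" with "pikepdf" or "pdfminer3"
# on a machine where those packages are installed.
print(package_directories("json"))
```

Run with the same interpreter that pyinstaller uses, this reports which copy of a package that interpreter actually sees.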

answered 2019-10-18T11:18:41.333

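I had this problem too. In my case the solution was simple. The first step was to add the line from pikepdf import _cpphelpers to my script.py file; the second was to add the argument --hidden-import=pikepdf._cpphelpers on the command line. After that everything started working ;)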
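
On the command line, the hidden-import option takes a plain dotted module name (the bracketed list form belongs in a .spec file). The same build can also be started from Python through PyInstaller.__main__.run. The following is a sketch under assumptions: pyinstaller_args and build are hypothetical names, and the script name and flags are taken from the question.

```python
from typing import Iterable, List


def pyinstaller_args(script: str, hidden_imports: Iterable[str]) -> List[str]:
    """Assemble a one-file PyInstaller argument list with hidden imports."""
    args = [script, "--onefile"]
    for module in hidden_imports:
        # One option per module, each a plain dotted name.
        args.append("--hidden-import=" + module)
    return args


ARGS = pyinstaller_args("PDF2TEXT.py", ["pikepdf._cpphelpers"])


def build() -> None:
    """Run the build; requires PyInstaller in the current environment."""
    import PyInstaller.__main__  # imported lazily so the sketch loads without it

    PyInstaller.__main__.run(ARGS)
```

Calling build() should be roughly equivalent to running pyinstaller --onefile --hidden-import=pikepdf._cpphelpers PDF2TEXT.py in a shell.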

answered 2020-09-27T11:52:11.863