python - Python3: module 'tabula' has no attribute 'read_pdf'

Question

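A .py program works, but the exact same code does not work when it is exposed as an API.

The code uses Tabula to read a PDF and return the table contents as output.

I have tried: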

import tabula
df = tabula.read_pdf("my_pdf")
print(df)

and

from tabula import wrapper
df = wrapper.read_pdf("my_pdf")
print(df)

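I have installed tabula-py (not tabula) on an AWS EC2 instance running Ubuntu.

Besides read_pdf, I actually want to convert the PDF to CSV and return that as output. But that does not work either. I get the same no-attribute error, i.e. module 'tabula' has no attribute 'convert_into'.

The .py file and the API file (also a .py file) are in the same directory and are accessed by the same user.

Any help would be appreciated.

Edit: I tried running the same Python file from the API as an OS command (os.system("python3 /home/ubuntu/flaskapp/tabler.py")). That did not work either.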

score 7 · Accepted Answer

Make sure you installed tabula-py and not just tabula, using

!pip install tabula-py

and import it using

from tabula.io import read_pdf
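A minimal sketch of that import, written defensively so the calling code can report a missing or conflicting package instead of crashing. The helper name `load_read_pdf` is hypothetical and not part of tabula-py.

```python
import importlib


def load_read_pdf():
    """Return tabula-py's read_pdf, or None if it cannot be imported."""
    try:
        # tabula-py keeps its reader functions in the tabula.io submodule.
        io_module = importlib.import_module("tabula.io")
    except ImportError:
        # No importable tabula.io exists in this environment.
        return None
    # A conflicting "tabula" package may lack read_pdf, so look it up safely.
    return getattr(io_module, "read_pdf", None)


read_pdf = load_read_pdf()
if read_pdf is None:
    print("tabula.io.read_pdf is not available in this environment")
```

If `read_pdf` comes back as `None`, the environment running the API probably does not see tabula-py, which would be worth checking before changing any other code.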

score 4 · Accepted Answer

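There is actually an entry in the FAQ about this issue:

If you've installed tabula, it will conflict with the namespace. You should install tabula-py after removing tabula.

Although using read_pdf() from tabula.io works, as other answers suggest, I was also able to use tabula.read_pdf() after removing tabula and reinstalling tabula-py (using pip install --force-reinstall tabula-py).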

score 2 · Accepted Answer

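If you accidentally installed tabula before installing tabula-py, the two will conflict in the namespace (even after you uninstall tabula).

Uninstall tabula-py and reinstall it. That worked for me.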
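Before reinstalling, it may help to confirm which of the two distributions pip actually sees. The following is a rough sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "tabula" and "tabula-py" are separate distributions that both provide
# a top-level module named tabula, which is why they can collide.
for name in ("tabula", "tabula-py"):
    print(name, "->", installed_version(name))
```

If `tabula` reports a version, that would be consistent with the namespace conflict described above. Run the check with the same interpreter that serves the API, since it may differ from the one used on the command line.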

score 2 · Accepted Answer

The tabula package has some issues. I looked inside it and there is no __init__.py. You can do:

from tabula.io import read_pdf

It worked for me.
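To see what Python actually resolves `tabula` to, one option is to inspect the module spec. This is only a diagnostic sketch; `describe_module` is a hypothetical helper. A directory without `__init__.py` is imported as a namespace package, in which case `origin` is `None` and the module exposes none of the expected functions.

```python
import importlib
import importlib.util


def describe_module(name, attribute):
    """Report where a module would be imported from and whether it has an attribute."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None, "has_attribute": False}
    try:
        module = importlib.import_module(name)
        has_attribute = hasattr(module, attribute)
    except ImportError:
        # The module was located but could not be loaded.
        has_attribute = False
    return {
        "found": True,
        "origin": spec.origin,  # file path the import resolves to, if any
        "has_attribute": has_attribute,
    }


print(describe_module("tabula", "read_pdf"))
```

An `origin` that points at an unexpected location, such as a local file named tabula.py, would be another possible explanation for the missing attribute.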

score 1 · Accepted Answer

from tabula import read_pdf did not work for me. I replaced tabula.read_pdf() with tabula.io.read_pdf() to make it work.

score 0 · Accepted Answer

It works like this:

import tabula # just this here!

#declare the path of your file
file_path = "/path/to/pdf_file/data.pdf"

#Convert your file
df = tabula.io.read_pdf(file_path)

That is all!
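The question also asks about CSV output. Assuming the same `tabula.io` import path works for `convert_into`, a sketch could look like the following. The wrapper `pdf_to_csv` and the paths in the comment are placeholders, and tabula-py needs a Java runtime available when the function is called.

```python
def pdf_to_csv(pdf_path, csv_path):
    """Write the tables found in pdf_path to csv_path using tabula-py."""
    # Imported inside the function so a missing or shadowed package
    # surfaces here as a clear error rather than at module load time.
    from tabula.io import convert_into

    # pages="all" asks tabula-py to scan the whole document.
    convert_into(pdf_path, csv_path, output_format="csv", pages="all")
    return csv_path


# Example call with placeholder paths (not executed here):
# pdf_to_csv("/path/to/pdf_file/data.pdf", "/path/to/pdf_file/data.csv")
```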

score -1 · Accepted Answer

Try

from tabula import read_pdf

I had the same problem, and this solved it.

Answered 2020-03-04T18:37:42.690