0

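I have a Python Flask application that internally uses tabula to extract tables from PDF files. After I run "cf push" and the application is running on PCF, I upload a PDF file to the application so it can read the tables. When the application tries to extract the table data, I get the following error.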

2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] [2020-08-10 08:08:40,134] ERROR in app: Exception on / [POST]
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] Traceback (most recent call last):
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/tabula/io.py", line 80, in _run
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] result = subprocess.run(
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/subprocess.py", line 489, in run
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] with Popen(*popenargs, **kwargs) as process:
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/subprocess.py", line 854, in __init__
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] self._execute_child(args, executable, preexec_fn, close_fds,
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/subprocess.py", line 1702, in _execute_child
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] raise child_exception_type(errno_num, err_msg, err_filename)
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] FileNotFoundError: [Errno 2] No such file or directory: 'java'
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] During handling of the above exception, another exception occurred:
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] Traceback (most recent call last):
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/flask/app.py", line 2446, in wsgi_app
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] response = self.full_dispatch_request()
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/flask/app.py", line 1951, in full_dispatch_request
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] rv = self.handle_user_exception(e)
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/flask/app.py", line 1820, in handle_user_exception
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] reraise(exc_type, exc_value, tb)
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] raise value
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/flask/app.py", line 1949, in full_dispatch_request
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] rv = self.dispatch_request()
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/flask/app.py", line 1935, in dispatch_request
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] return self.view_functions[rule.endpoint](**req.view_args)
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "app.py", line 55, in index
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] wireListDF = pdfExtractorOBJ.getWireListDataFrame()
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/app/WireHarnessPDFExtractor.py", line 158, in getWireListDataFrame
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] self.readBTPPDF()
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/app/WireHarnessPDFExtractor.py", line 31, in readBTPPDF
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] df = tabula.read_pdf(self.pdf_path, pages='all', stream=True ,guess=True, encoding="utf-8",
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/tabula/io.py", line 322, in read_pdf
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] output = _run(java_options, kwargs, path, encoding)
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/tabula/io.py", line 91, in _run
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] raise JavaNotFoundError(JAVA_NOT_FOUND_ERROR)
2020-08-10T13:38:40.135+05:30 [APP/PROC/WEB/0] [ERR] tabula.errors.JavaNotFoundError: `java` command is not found from this Python process.Please ensure Java is installed and PATH is set for `java`
2020-08-10T13:38:40.136+05:30 [APP/PROC/WEB/0] [ERR] 10.255.10.112 - - [10/Aug/2020 08:08:40] "POST / HTTP/1.1" 500 -

I understand that tabula has a Java dependency. Any suggestions on how to set up a Python Flask application that uses tabula so that it can run on the PCF platform?

4

2 Answers

1

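This is a Java path error: your Python runtime simply cannot find the java executable. You need to make sure the directory that contains java is included in your exported PATH variable. If you are running this process on Linux, you can run export PATH=<your java bin dir>:$PATH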
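As a rough illustration of the same idea from inside the Python process, the sketch below prepends a directory to PATH and then checks whether java resolves. The function name add_java_to_path is hypothetical, and the directory you pass in is whatever your own Java bin directory turns out to be; nothing here is specific to tabula or PCF.

```python
import os
import shutil


def add_java_to_path(java_bin_dir):
    """Prepend java_bin_dir to PATH and return the resolved java path (or None).

    java_bin_dir is an assumption: it must be the directory that actually
    contains the java executable in your environment.
    """
    current = os.environ.get("PATH", "")
    os.environ["PATH"] = java_bin_dir + os.pathsep + current
    # shutil.which performs the same kind of PATH lookup that subprocess
    # relies on when a library runs the bare command name "java".
    return shutil.which("java")


# Hypothetical usage, before any tabula.read_pdf call:
#     if add_java_to_path("<your java bin dir>") is None:
#         raise RuntimeError("java is still not reachable from this process")
```

Because child processes inherit os.environ, a change made this way should be visible to the subprocess call that tabula makes, provided it runs before that call.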

Answered 2020-08-10T08:24:02.273
0

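Highlights:

  • You need multiple buildpacks: one that provides Java and one for Python.
  • You want to use the apt-buildpack, not the Java buildpack.
  • You need to set PATH so that it points to the location where the apt-buildpack installs Java (or have your app look for Java in that specific location).
  • You can set PATH in a .profile file.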

All of this is explained in my answer to this similar question.
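For the "have your app look for Java" option, a minimal sketch might look like the following. It assumes only that a java executable exists somewhere under a root directory you choose; the helper names find_java_bin and ensure_java_on_path are hypothetical, and the exact directory the apt-buildpack installs into is not stated here, so the code searches rather than hard-coding a path.

```python
import os
from pathlib import Path


def find_java_bin(search_root):
    """Return the first directory under search_root holding an executable
    file named java, or None if there is none."""
    for candidate in sorted(Path(search_root).rglob("java")):
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate.parent)
    return None


def ensure_java_on_path(search_root):
    """Prepend the discovered java directory to PATH; return True on success."""
    bin_dir = find_java_bin(search_root)
    if bin_dir is None:
        return False
    os.environ["PATH"] = bin_dir + os.pathsep + os.environ.get("PATH", "")
    return True


# Hypothetical usage at app start-up. The traceback in the question shows
# dependencies living under /home/vcap/deps, so that is used as the root
# here, but treat it as an assumption for your own deployment:
#     ensure_java_on_path("/home/vcap/deps")
```

A recursive search over a large dependency tree can be slow, so running it once at start-up, or setting PATH in .profile once the location is known, is probably the better long-term choice.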

Answered 2020-08-15T17:40:32.207