python-3.x - PyArrow OSError: [WinError 193] %1 is not a valid win32 application

Question

My operating system is 64-bit Windows 10, and I use Anaconda 3.8 (64-bit). I am trying to develop a Hadoop File System 3.3 client with the PyArrow module. Installing PyArrow with conda on Windows 10 succeeded.

> conda install -c conda-forge pyarrow

However, connecting to HDFS 3.3 with pyarrow raises the error shown below.

import pyarrow as pa
fs = pa.hdfs.connect(host='localhost', port=9000)

The error is:

Traceback (most recent call last):
  File "C:\eclipse-workspace\PythonFredProj\com\aaa\fred\hdfs3-test.py", line 14, in <module>
    fs = pa.hdfs.connect(host='localhost', port=9000)
  File "C:\Python-3.8.3-x64\lib\site-packages\pyarrow\hdfs.py", line 208, in connect
    fs = HadoopFileSystem(host=host, port=port, user=user,
  File "C:\Python-3.8.3-x64\lib\site-packages\pyarrow\hdfs.py", line 38, in __init__
    _maybe_set_hadoop_classpath()
  File "C:\Python-3.8.3-x64\lib\site-packages\pyarrow\hdfs.py", line 136, in _maybe_set_hadoop_classpath
    classpath = _hadoop_classpath_glob(hadoop_bin)
  File "C:\Python-3.8.3-x64\lib\site-packages\pyarrow\hdfs.py", line 163, in _hadoop_classpath_glob
    return subprocess.check_output(hadoop_classpath_args)
  File "C:\Python-3.8.3-x64\lib\subprocess.py", line 411, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "C:\Python-3.8.3-x64\lib\subprocess.py", line 489, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Python-3.8.3-x64\lib\subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Python-3.8.3-x64\lib\subprocess.py", line 1307, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
OSError: [WinError 193] %1 is not a valid win32 application

I installed Visual C++ 2015 on Windows 10, but the same error still appears.

score 0 · Accepted Answer

Here is my solution.

Before starting pyarrow, Hadoop 3 must be installed on 64-bit Windows 10, and its installation path must be added to Path.
Install pyarrow 3.0 (the version matters; it must be 3.0).

pip install pyarrow==3.0
Create a PyDev module in the Eclipse PyDev perspective. The sample code is as follows:

from pyarrow import fs

hadoop = fs.HadoopFileSystem("localhost", port=9000)
print(hadoop.get_file_info('/'))
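
Before running the sample, it may be worth confirming that a Hadoop launcher is actually reachable through Path, since the traceback in the question fails at the point where pyarrow tries to start the hadoop command. The following is a minimal sketch using only the standard library; `find_hadoop_launcher` is a hypothetical helper name and is not part of pyarrow.

```python
import shutil


def find_hadoop_launcher(search_path=None):
    """Return the full path of the `hadoop` launcher, or None if not found.

    Hypothetical helper for illustration only. On Windows, shutil.which
    consults PATHEXT, so a `hadoop.cmd` in Hadoop's bin directory can match.
    """
    return shutil.which("hadoop", path=search_path)


if __name__ == "__main__":
    launcher = find_hadoop_launcher()
    if launcher is None:
        print("hadoop was not found on Path; check the Hadoop bin directory")
    else:
        print("hadoop launcher:", launcher)
```

If this prints that hadoop was not found, the Path entry from the first step is the likely thing to revisit.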
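Select the PyDev module you created and click [Properties (Alt + Enter)].
Click [Run/Debug Settings]. Select the PyDev module and click the [Edit] button.
In the [Edit Configuration] window, select the [Environment] tab.
Click the [Add] button.
You must create two environment variables: "CLASSPATH" and "LD_LIBRARY_PATH".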
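
The answer as posted does not show the values for these two variables. As one possible way to obtain a CLASSPATH value, Hadoop's `hadoop classpath --glob` command prints the expanded list of jars, and it appears to be what `_hadoop_classpath_glob` in the question's traceback tries to run. The sketch below sets CLASSPATH from Python instead of the Eclipse dialog. `hadoop_classpath_command` and `load_hadoop_classpath` are hypothetical helper names, and choosing `hadoop.cmd` on Windows is an assumption based on the usual Hadoop bin layout. LD_LIBRARY_PATH is not covered here.

```python
import os
import subprocess
import sys


def hadoop_classpath_command(hadoop_home, windows=None):
    """Build the argument list for `hadoop classpath --glob`.

    Hypothetical helper. Assumes the usual Hadoop layout, where bin/ holds
    a `hadoop` shell script and, for Windows, a `hadoop.cmd` batch file.
    """
    if windows is None:
        windows = sys.platform.startswith("win")
    launcher = "hadoop.cmd" if windows else "hadoop"
    return [os.path.join(hadoop_home, "bin", launcher), "classpath", "--glob"]


def load_hadoop_classpath(hadoop_home, env=None):
    """Run the command and store its output under CLASSPATH in `env`."""
    env = os.environ if env is None else env
    output = subprocess.check_output(hadoop_classpath_command(hadoop_home))
    env["CLASSPATH"] = output.decode().strip()
    return env["CLASSPATH"]
```

A call such as `load_hadoop_classpath(os.environ["HADOOP_HOME"])` would need to happen before the `HadoopFileSystem` object is created, and it assumes HADOOP_HOME points at the Hadoop installation directory.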