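I am trying to experiment with MLlib using Python on Windows. It seems I need Spark, which in turn needs Hadoop. I have installed Anaconda2, which includes Python 2.7, numpy, etc.
I have been following this recipe, which seems to get me most of the way there, but I think I am stuck on this last error: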
Python 2.7.13 |Anaconda 4.3.1 (64-bit)| (default, Dec 19 2016, 13:29:36) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Traceback (most recent call last):
  File "C:\spark\bin\..\python\pyspark\shell.py", line 43, in <module>
    spark = SparkSession.builder\
  File "C:\spark\python\pyspark\sql\session.py", line 179, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File "C:\spark\python\lib\py4j-0.10.4-src.zip\py4j\java_gateway.py", line 1133, in __call__
  File "C:\spark\python\pyspark\sql\utils.py", line 79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"
From this output it is clear that there is no error about winutils.exe not being found.
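To double-check the winutils.exe side independently of the Spark output, something like the following could be run in plain Python. This is only a sketch: it assumes the usual layout where winutils.exe sits in the `bin` folder under `HADOOP_HOME`, and `find_winutils` is a helper name I made up for illustration, not part of Spark or Hadoop.

```python
import os


def find_winutils(env=None):
    """Return the path to winutils.exe if it exists under HADOOP_HOME, else None.

    Assumption: winutils.exe is expected at <HADOOP_HOME>/bin/winutils.exe.
    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        # HADOOP_HOME is not set (or is empty), so there is nothing to check.
        return None
    candidate = os.path.join(hadoop_home, "bin", "winutils.exe")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    # Prints the resolved path, or None if the variable or the file is missing.
    print(find_winutils())
```

A `None` result would mean either the variable is unset or the file is not where this sketch expects it; a path only confirms the file exists, not that it matches the Hadoop build in use.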
Also, the exception originates on the Java side of py4j, but because of the IllegalArgumentException we lose the traceback.
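One thing that might recover part of it: the `raise` line in utils.py above passes a `stackTrace` value into the exception, so the Java-side text may still be attached to the Python exception object. The sketch below assumes the exception keeps its two constructor arguments as `desc` and `stackTrace` attributes; `java_side_details` is a hypothetical helper, and it falls back to `str(exc)` when those attributes are absent.

```python
def java_side_details(exc):
    """Return whatever Java-side description and stack text an exception carries.

    Assumption: exceptions raised by pyspark.sql.utils expose `desc` and
    `stackTrace` attributes. Any other exception is returned as str(exc).
    """
    desc = getattr(exc, "desc", None)
    stack = getattr(exc, "stackTrace", None)
    if desc is None and stack is None:
        return str(exc)
    return u"{0}\n{1}".format(desc, stack)


# Intended use from a plain Python script instead of the pyspark shell
# (left commented out because it needs a working Spark install):
#
# from pyspark.sql import SparkSession
# try:
#     spark = SparkSession.builder.getOrCreate()
# except Exception as e:
#     print(java_side_details(e))
```

This would only show the stack text stored on the exception itself, so a nested Java cause might still be missing from it.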
All guidance appreciated!
Cheers