I am trying to use pyspark.mllib.stat.KernelDensity like this:
data = sc.parallelize([0, 1, 2, 2, 1, 1, 1, 1, 1, 2, 0, 0])
kd = KernelDensity()
kd.setSample(data)
kd.setBandwidth(3)
densities = kd.estimate([-1.0, 2.0, 5.0])
But I end up with this error:
---------------------------------------------------------------------------
Py4JError                                 Traceback (most recent call last)
 in ()
      8 
      9 # Find density estimates for the given values
---> 10 densities = kd.estimate([-1.0, 2.0, 5.0])

/home/user10215193/anaconda3/lib/python3.6/site-packages/pyspark/mllib/stat/KernelDensity.py in estimate(self, points)
     56         points = list(points)
     57         densities = callMLlibFunc(
---> 58             "estimateKernelDensity", self._sample, self._bandwidth, points)
     59         return np.asarray(densities)

/home/user10215193/anaconda3/lib/python3.6/site-packages/pyspark/mllib/common.py in callMLlibFunc(name, *args)
    129     api = getattr(sc._jvm.PythonMLLibAPI(), name)
    130     print(api)
--> 131     return callJavaFunc(sc, api, *args)
    132 
    133 

/home/user10215193/anaconda3/lib/python3.6/site-packages/pyspark/mllib/common.py in callJavaFunc(sc, func, *args)
    121     """ Call Java Function """
    122     args = [_py2java(sc, a) for a in args]
--> 123     return _java2py(sc, func(*args))
    124 
    125 

/home/user10215193/anaconda3/lib/python3.6/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1131         answer = self.gateway_client.send_command(command)
   1132         return_value = get_return_value(
-> 1133             answer, self.gateway_client, self.target_id, self.name)
   1134 
   1135         for temp_arg in temp_args:

/home/user10215193/anaconda3/lib/python3.6/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    321                 raise Py4JError(
    322                     "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
--> 323                     format(target_id, ".", name, value))
    324         else:
    325             raise Py4JError(

Py4JError: An error occurred while calling o19.estimateKernelDensity. Trace:
py4j.Py4JException: Method estimateKernelDensity([class org.apache.spark.api.java.JavaRDD, class java.lang.Integer, class java.util.ArrayList]) does not exist
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
    at py4j.Gateway.invoke(Gateway.java:272)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)
I could not find anything similar here, so I would really appreciate it if someone could help me.
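One detail in the trace that may matter: the method lookup lists java.lang.Integer as the second argument, which is the bandwidth I passed as the Python int 3. Py4J sends a Python int as a Java integer and a Python float as java.lang.Double. Below is a minimal sketch of what could be tried, assuming the JVM-side estimateKernelDensity expects a double bandwidth (the trace suggests this, but it is an assumption here, not something confirmed). The helper names as_double and estimate_densities are hypothetical and exist only for illustration.

```python
def as_double(value):
    """Coerce a number to a Python float, which Py4J sends as java.lang.Double."""
    return float(value)


def estimate_densities(sc, sample, bandwidth, points):
    """Hypothetical wrapper: run KernelDensity with float-only arguments.

    `sc` is an existing SparkContext. pyspark is imported lazily so that
    this file can be loaded on a machine without Spark installed.
    """
    from pyspark.mllib.stat import KernelDensity

    kd = KernelDensity()
    # Floats in the sample RDD, in case the JVM side expects doubles there too.
    kd.setSample(sc.parallelize([as_double(x) for x in sample]))
    # Assumption: passing 3.0 instead of 3 lets Py4J find the method.
    kd.setBandwidth(as_double(bandwidth))
    return kd.estimate([as_double(p) for p in points])
```

With a running SparkContext this would be called as estimate_densities(sc, [0, 1, 2, 2, 1, 1, 1, 1, 1, 2, 0, 0], 3, [-1.0, 2.0, 5.0]). The shorter variant is simply kd.setBandwidth(3.0) in the original snippet.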