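I am trying to normalize the values of a column in a data frame with the L1 norm, using the pyspark ML library.
My code is below, but it fails. Can you help me figure out what is wrong with this code?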
from pyspark.ml.feature import Normalizer
y = range(1,10)
data = spark.createDataFrame([[float(e), ] for e in y])
#data.select('_1').show()
normalizer = Normalizer(p=1.0, inputCol="_1", outputCol="features")
data2 = normalizer.transform(data)
data2.select("features").show()
Below is part of the error log.
Py4JJavaError: An error occurred while calling o857.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 36.0 failed 4 times, most recent failure: Lost task 0.3
in stage 36.0 (TID 67, XXXXX.serveraddress.com):
org.apache.spark.SparkException: Failed to execute user defined
function($anonfun$createTransformFunc$1: (double) => vector)
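
The `(double) => vector` signature in the last log line suggests that the transform function received a plain double column, whereas `Normalizer` works on a vector column and rescales each row vector by its own p-norm. In pyspark a numeric column is usually wrapped into a vector column first, for example with `VectorAssembler`, but even then every single-element row would normalize to 1.0. The following is a minimal pure-Python sketch of that row-wise behaviour, with no Spark dependency, contrasted with scaling the whole column by its L1 norm. `normalize_row` is a hypothetical helper written for illustration, not part of pyspark, and its handling of all-zero rows is an assumption of this sketch.

```python
def normalize_row(values, p=1.0):
    """Divide every entry of one row vector by that row's p-norm."""
    norm = sum(abs(v) ** p for v in values) ** (1.0 / p)
    # Assumption of this sketch: an all-zero row is returned unchanged
    # to avoid dividing by zero.
    return [v / norm for v in values] if norm else list(values)


# Same data as in the question: one value per row.
rows = [[float(e)] for e in range(1, 10)]

# Row-wise normalization: each row is divided by its own norm,
# so every single-element row collapses to [1.0].
row_wise = [normalize_row(r) for r in rows]

# Column-wise alternative: divide each value by the L1 norm of the column.
column = [r[0] for r in rows]
col_norm = sum(abs(v) for v in column)
column_wise = [v / col_norm for v in column]
```

If the goal is the column-wise result, row-wise normalization is probably not the right tool, and dividing the column by an aggregated sum is closer to the intent.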