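I am currently testing MLeap as a way to serve predictions from Spark models. To do so, I first implemented the Spark linear regression example described here: https://spark.apache.org/docs/2.3.0/ml-classification-regression.html#linear-regression. I was able to save the model in an MLeap bundle and reuse it in another Spark context. Now I would like to use this bundle in the MLeap runtime, but I am running into casting issues that prevent it from working.
The error comes from the schema definition: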
val dataSchema = StructType(Seq(
StructField("label", ScalarType.Double),
StructField("features", ListType.Double)
)).get
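The "features" part is a set of columns grouped together. I tried several things, with no luck:
val dataSchema = StructType(Seq(
StructField("label", ScalarType.Double),
StructField("features", ListType.Double)
)).get
=> this gives me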
java.lang.IllegalArgumentException: Cannot cast ListType(double,true) to TensorType(double,Some(WrappedArray(10)),true)
So I tried:
val dataSchema = StructType(Seq(
StructField("label", ScalarType.Double),
StructField("features", TensorType.Double(10))
)).get
But that gives me
java.lang.ClassCastException: scala.collection.immutable.$colon$colon cannot be cast to ml.combust.mleap.tensor.Tensor
Here is the full code:
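// Imports assumed for MLeap 0.10+ and scala-arm; adjust to your versions
import ml.combust.bundle.BundleFile
import ml.combust.mleap.core.types._
import ml.combust.mleap.runtime.MleapSupport._
import ml.combust.mleap.runtime.frame.{DefaultLeapFrame, Row}
import resource._
val dataSchema = StructType(Seq(
  StructField("label", ScalarType.Double),
  StructField("features", TensorType.Double(10))
)).get
// One row: the label, then the 10 feature values passed as a plain Seq
val data = Seq(Row(-9.490009878824548, Seq(0.4551273600657362, 0.36644694351969087, -0.38256108933468047, -0.4458430198517267, 0.33109790358914726, 0.8067445293443565, -0.2624341731773887, -0.44850386111659524, -0.07269284838169332, 0.5658035575800715)))
// Load the bundle that was exported from Spark
val bundle = (for (bundleFile <- managed(BundleFile("jar:file:/tmp/spark-lrModel.zip"))) yield {
  bundleFile.loadMleapBundle().get
}).tried.get
val model = bundle.root
val to_test = DefaultLeapFrame(dataSchema, data)
val res = model.transform(to_test).get // => this is the line that raises the exception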
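One reading of the second exception is that the schema is now correct, but the row still carries a plain Scala List (`$colon$colon`) where a tensor value is expected. Below is a minimal sketch of building the row with a tensor instead. It assumes that `Tensor.denseVector` from `ml.combust.mleap.tensor` and the `ml.combust.mleap.runtime.frame` package layout are available in the MLeap version in use, and the object name `TensorRowSketch` is only illustrative. It has not been checked against this particular bundle.

```scala
import ml.combust.mleap.core.types.{ScalarType, StructField, StructType, TensorType}
import ml.combust.mleap.runtime.frame.{DefaultLeapFrame, Row}
import ml.combust.mleap.tensor.Tensor

// Hypothetical wrapper object, only here to make the sketch self-contained.
object TensorRowSketch {
  // Same schema as above: "features" is declared as a 10-element double tensor.
  val dataSchema: StructType = StructType(Seq(
    StructField("label", ScalarType.Double),
    StructField("features", TensorType.Double(10))
  )).get

  // Assumption: Tensor.denseVector wraps the array as a one-dimensional dense tensor.
  val features = Tensor.denseVector(Array(
    0.4551273600657362, 0.36644694351969087, -0.38256108933468047,
    -0.4458430198517267, 0.33109790358914726, 0.8067445293443565,
    -0.2624341731773887, -0.44850386111659524, -0.07269284838169332,
    0.5658035575800715
  ))

  // The row now holds a tensor value instead of a Seq[Double].
  val data = Seq(Row(-9.490009878824548, features))
  val frame = DefaultLeapFrame(dataSchema, data)
}
```

If this reading is right, passing `TensorRowSketch.frame` to `model.transform` in place of `to_test` should no longer hit the cast from List to Tensor.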
I am a bit lost with this type mapping now. Any ideas?
Thanks,
Stephanie