
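I want to deploy Spark ML machine learning models with MLeap and use them for real-time prediction.

The creators have published a Scala tutorial, but I need to support a Java 8 codebase.

How would I implement the following code in Java 8?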

val pipeline = SparkUtil.createPipelineModel(uid = "pipeline", Array(featureModel, rfModel))

val sbc = SparkBundleContext()
for (bf <- managed(BundleFile("jar:file:/tmp/mnist.model.rf.zip"))) {
  pipeline.writeBundle.save(bf)(sbc).get
}

val bundle = (for(bundleFile <- managed(BundleFile("jar:file:/tmp/simple-spark-pipeline.zip"))) yield {
  bundleFile.loadMleapBundle().get
}).opt.get

2 Answers


If you only use plain Spark ML transformers, you can save and load models easily with SimpleSparkSerializer.

Save:

new SimpleSparkSerializer().serializeToBundle(model, "jar:file:/tmp/model.zip", trainData);

Load:

Transformer model = new SimpleSparkSerializer().deserializeFromBundle("jar:file:/tmp/model.zip");
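
A note on the two pieces of the Scala snippet that have no obvious Java counterpart: `managed(...)` is a resource-management wrapper, and its closest Java 8 equivalent is try-with-resources; the `jar:file:` prefix addresses a zip archive. The sketch below uses only the JDK, not MLeap, to show both ideas together. I am assuming that MLeap's BundleFile relies on the JDK zip file system for such URIs; the class name `JarUriDemo`, the entry name `entry.txt`, and the temporary directory are all made up for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class JarUriDemo {

    // Writes `text` into a new zip archive addressed by a jar:file: URI,
    // then reopens the archive and reads the entry back.
    static String roundTrip(String text) throws IOException {
        // Hypothetical scratch location, used only for this demonstration.
        Path dir = Files.createTempDirectory("bundle-demo");
        Path zip = dir.resolve("model.zip");
        URI uri = URI.create("jar:" + zip.toUri());

        Map<String, String> env = new HashMap<>();
        env.put("create", "true"); // create the archive if it does not exist yet

        // try-with-resources plays the role of Scala's managed(...):
        // the archive is closed, and flushed to disk, even if the body throws.
        try (FileSystem fs = FileSystems.newFileSystem(uri, env)) {
            Files.write(fs.getPath("/entry.txt"), text.getBytes(StandardCharsets.UTF_8));
        }

        // Reopen the same URI to read the entry back, then clean up.
        try (FileSystem fs = FileSystems.newFileSystem(uri, env)) {
            byte[] bytes = Files.readAllBytes(fs.getPath("/entry.txt"));
            return new String(bytes, StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(zip);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello bundle"));
    }
}
```

With SimpleSparkSerializer you do not have to manage the archive yourself; the sketch only shows what the `jar:file:` path refers to and why the Scala version wraps it in `managed`.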
Answered 2017-07-20T10:10:29.433

You can skip loading Spark and its heavyweight classes altogether and load the bundle directly through the MLeap runtime.

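private static Transformer kMeansModel;
private static MleapContext mleapContext;
private static BundleBuilder bundleBuilder;

public MLeapLocalService() throws IOException {
    // Build the MLeap runtime context; no Spark session is involved.
    mleapContext = new ContextBuilder().createMleapContext();
    bundleBuilder = new BundleBuilder();
    // resourceLoader is not defined in this snippet; presumably a Spring ResourceLoader injected elsewhere.
    Resource res = resourceLoader.getResource("classpath:aihello.com/aimodels/kmeans-model.zip");
    // Load the bundle and keep its root transformer for later predictions.
    kMeansModel = bundleBuilder.load(res.getFile(), mleapContext).root();
}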

You can then make predictions as follows:

    // Describe the input schema: a single string column named "docs".
    LeapFrameBuilder builder = new LeapFrameBuilder();
    List<StructField> fields = new ArrayList<>();
    fields.add(builder.createField("docs", builder.createString()));
    StructType schema = builder.createSchema(fields);
    // Build a one-row frame; docs is the input string supplied by the caller.
    List<Row> rows = new ArrayList<>();
    rows.add(builder.createRow(docs));
    DefaultLeapFrame frame = builder.createFrame(schema, rows);
    // transform returns a Try; get() unwraps the result or throws on failure.
    DefaultLeapFrame returnFrame = kMeansModel.transform(frame).get();
Answered 2019-12-27T00:16:02.657