
I'm trying to build a very simple Scala standalone app using MLlib, but I get the following error when trying to build the program:

Object Mllib is not a member of package org.apache.spark

Then I realized that I have to add MLlib as a dependency, as follows:

version := "1"
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
"org.apache.spark"  %% "spark-core"              % "1.1.0",
"org.apache.spark"  %% "spark-mllib"             % "1.1.0"
)

But here I got an error that says:

unresolved dependency spark-core_2.10.4;1.1.1 : not found

So I had to modify it to

"org.apache.spark" % "spark-core_2.10" % "1.1.1",

But there is still an error that says:

unresolved dependency spark-mllib;1.1.1 : not found

Does anyone know how to add the MLlib dependency in the .sbt file?


2 Answers


As @lmm pointed out, you can include the libraries as:

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10"  % "1.1.0",
  "org.apache.spark" % "spark-mllib_2.10" % "1.1.0"
)

In sbt, %% includes the Scala version in the artifact name, and you are building with Scala version 2.10.4, whereas the Spark artifacts are generally published against 2.10.
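For illustration, a minimal sketch of the difference; the artifact names in the comments reflect the behaviour described above, not a general rule for newer sbt releases:

// %% derives the artifact suffix from the Scala version in use; under the setup
// described in the question this looked for spark-core_2.10.4, which is not published
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0"

// plain % takes the artifact name literally, so the published _2.10 build is found
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.1.0"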

It should be noted that if you are going to make an assembly jar to deploy your application, you may wish to mark spark-core as provided, e.g.

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10"  % "1.1.0" % "provided",
  "org.apache.spark" % "spark-mllib_2.10" % "1.1.0"
)

since the spark-core package will be on the executor's classpath anyway.
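If you go the assembly route, a minimal setup sketch might look like this; the sbt-assembly plugin version here is an assumption, so match it to your sbt release:

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")  // version is an assumption

// build.sbt -- spark-core is "provided" so it stays out of the fat jar,
// since it is already on the executor classpath; mllib is bundled as above
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10"  % "1.1.0" % "provided",
  "org.apache.spark" % "spark-mllib_2.10" % "1.1.0"
)

Running sbt assembly then produces a single jar that you can hand to spark-submit.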

answered 2014-12-13T00:36:04.397

Here is another way to add the dependency to your build.sbt file if you use the Databricks sbt-spark-package plugin:

sparkComponents ++= Seq("sql","hive", "mllib")
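For completeness, a sketch of what that looks like end to end; the plugin coordinates, resolver, and Spark version below are assumptions, so check the sbt-spark-package README for the current ones:

// project/plugins.sbt -- coordinates and resolver are assumptions, verify against the plugin docs
resolvers += "Spark Packages Repo" at "https://dl.bintray.com/spark-packages/maven/"
addSbtPlugin("org.spark-packages" % "sbt-spark-package" % "0.2.6")

// build.sbt -- the plugin expands each listed component into the matching spark-* dependency
sparkVersion := "2.1.0"   // assumed version, for illustration only
sparkComponents ++= Seq("sql", "hive", "mllib")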
answered 2017-03-03T01:17:47.590