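I tried to solve this by following these two threads (this and this). It worked on my own VM, but not on Cloud Dataproc. I followed the same procedure on both, yet the cloud cluster still reports the same error I originally saw on the VM. What should I do on the cloud side to fix it?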
1 Answer
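Did you do the full "git clone" steps from those linked threads? And do you actually need to modify jblas? If not, you should just pull it from Maven Central with --packages org.jblas:jblas:1.2.4, without the git clone or mvn install steps. The following worked fine for me on a new Dataproc cluster: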
$ spark-shell --packages org.jblas:jblas:1.2.4
Ivy Default Cache set to: /home/dhuo/.ivy2/cache
The jars for the packages stored in: /home/dhuo/.ivy2/jars
:: loading settings :: url = jar:file:/usr/lib/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.jblas#jblas added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found org.jblas#jblas;1.2.4 in central
downloading https://repo1.maven.org/maven2/org/jblas/jblas/1.2.4/jblas-1.2.4.jar ...
[SUCCESSFUL ] org.jblas#jblas;1.2.4!jblas.jar (605ms)
:: resolution report :: resolve 713ms :: artifacts dl 608ms
:: modules in use:
org.jblas#jblas;1.2.4 from central in [default]
---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   1   |   1   |   1   |   0   ||   1   |   1   |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
1 artifacts copied, 0 already retrieved (10360kB/29ms)
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR,/etc/hive/conf.dist/ivysettings.xml will be used
Spark context Web UI available at http://10.240.2.221:4040
Spark context available as 'sc' (master = yarn, app id = application_1501548510890_0005).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_131)
Type in expressions to have them evaluated.
Type :help for more information.
scala> import org.jblas.DoubleMatrix
import org.jblas.DoubleMatrix
scala> :quit
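Additionally, if you need to submit jobs that require "packages" through Dataproc's job submission API, note that --packages is really syntactic sugar in the various Spark launcher scripts rather than a property of a Spark job. In that case you need to use the equivalent property, spark.jars.packages, instead, as explained in this StackOverflow answer.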
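As a rough sketch of what that could look like with the gcloud CLI: the cluster name, region, main class, and jar path below are placeholders I made up for illustration, not values from the question. The snippet only assembles and prints the command so you can check it before submitting anything.

```shell
# Hypothetical placeholders: substitute your own cluster, region, class and jar.
CLUSTER="my-cluster"
REGION="us-central1"
MAIN_CLASS="com.example.MyJob"
APP_JAR="gs://my-bucket/my-job.jar"

# Maven coordinates of the dependency, the same value passed to --packages above.
PACKAGES="org.jblas:jblas:1.2.4"

# spark.jars.packages is the Spark property behind the --packages launcher flag;
# here it is passed as a job property instead of a launcher argument.
SUBMIT_CMD="gcloud dataproc jobs submit spark --cluster=${CLUSTER} --region=${REGION} --class=${MAIN_CLASS} --jars=${APP_JAR} --properties=spark.jars.packages=${PACKAGES}"

# Print the command for review instead of running it.
echo "$SUBMIT_CMD"
```

Once the printed command looks right, you can run it directly (for example with eval "$SUBMIT_CMD"). The job should then resolve jblas from Maven Central at startup, the same way the spark-shell session above did.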
answered 2017-09-01T02:24:08.190