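I tried to solve this by following these two threads (this, this). That worked on my own virtual machine, but it does not work on Cloud Dataproc. I followed the same procedure in both environments, yet the cloud cluster still fails with the same error I originally saw on the VM. What do I need to do on the cloud side to fix it? (Error screenshot)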

1 Answer

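Did you go through the full "git clone" steps from those linked threads? Do you actually need to modify jblas? If not, you should simply pull it from Maven Central with --packages org.jblas:jblas:1.2.4, without any git clone or mvn install. The following worked fine for me on a fresh Dataproc cluster: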

$ spark-shell --packages org.jblas:jblas:1.2.4
Ivy Default Cache set to: /home/dhuo/.ivy2/cache
The jars for the packages stored in: /home/dhuo/.ivy2/jars
:: loading settings :: url = jar:file:/usr/lib/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.jblas#jblas added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
  confs: [default]
  found org.jblas#jblas;1.2.4 in central
downloading https://repo1.maven.org/maven2/org/jblas/jblas/1.2.4/jblas-1.2.4.jar ...
  [SUCCESSFUL ] org.jblas#jblas;1.2.4!jblas.jar (605ms)
:: resolution report :: resolve 713ms :: artifacts dl 608ms
  :: modules in use:
  org.jblas#jblas;1.2.4 from central in [default]
  ---------------------------------------------------------------------
  |                  |            modules            ||   artifacts   |
  |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
  ---------------------------------------------------------------------
  |      default     |   1   |   1   |   1   |   0   ||   1   |   1   |
  ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
  confs: [default]
  1 artifacts copied, 0 already retrieved (10360kB/29ms)
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR,/etc/hive/conf.dist/ivysettings.xml will be used
Spark context Web UI available at http://10.240.2.221:4040
Spark context available as 'sc' (master = yarn, app id = application_1501548510890_0005).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_131)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import org.jblas.DoubleMatrix
import org.jblas.DoubleMatrix

scala> :quit

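Additionally, if you need to submit jobs that require "packages" through Dataproc's job submission API, note that --packages is really syntactic sugar in the various Spark launcher scripts rather than a property of the Spark job itself. In that case you need to use the equivalent spark.jars.packages property instead, as explained in this StackOverflow answer.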
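As a rough sketch of what that could look like with the gcloud CLI, the snippet below passes spark.jars.packages through the --properties flag of gcloud dataproc jobs submit spark. The cluster name, region, main class, and jar path are hypothetical placeholders, not values from the question; the snippet only prints the command so it can be reviewed before running.

```shell
# Hypothetical placeholders: substitute your own cluster, region, class, and jar.
CLUSTER="my-cluster"
REGION="us-central1"
MAIN_CLASS="com.example.MyJob"
JOB_JAR="gs://my-bucket/my-job.jar"

# Same Maven coordinate used with --packages in the spark-shell session above.
PACKAGES="org.jblas:jblas:1.2.4"

# spark.jars.packages is the Spark property equivalent of --packages.
PROPS="spark.jars.packages=${PACKAGES}"

# Assemble the submission command as a string (dry run).
SUBMIT_CMD="gcloud dataproc jobs submit spark --cluster=${CLUSTER} --region=${REGION} --class=${MAIN_CLASS} --jars=${JOB_JAR} --properties=${PROPS}"

# Print the command for inspection; run the printed line to actually submit.
echo "${SUBMIT_CMD}"
```

Printing first is just a cautious choice for an example; in practice you would run the gcloud command directly once the placeholders are filled in.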

Answered 2017-09-01T02:24:08.190