I am trying to start spark-shell after setting up Spark 1.2.1 on the Cloudera QuickStart VM, and I get the following error. Any help resolving it would be appreciated. The error log is below:

16/03/03 09:40:37 INFO EventLoggingListener: Logging events to hdfs://quickstart.cloudera:8020/user/spark/applicationHistory/local-1457026830824
org.apache.spark.SparkException: spark.dynamicAllocation.{min/max}Executors must be set!
    at org.apache.spark.ExecutorAllocationManager.validateSettings(ExecutorAllocationManager.scala:135)
    at org.apache.spark.ExecutorAllocationManager.<init>(ExecutorAllocationManager.scala:98)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:377)
    at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
    at $iwC$$iwC.<init>(<console>:9)
    at $iwC.<init>(<console>:18)
    at <init>(<console>:20)
    at .<init>(<console>:24)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:852)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1125)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:674)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:705)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:669)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:828)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:873)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:785)
    at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:123)
    at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:122)
    at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:270)
    at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:122)
    at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:60)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:147)
    at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:60)
    at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:106)
    at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:60)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:962)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:916)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1011)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


scala> 
1 Answer

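The exception is pretty clear. You appear to have set the spark.dynamicAllocation.enabled property to true but did not set spark.dynamicAllocation.minExecutors and spark.dynamicAllocation.maxExecutors. The Spark 1.2.1 documentation states this clearly (from the description of spark.dynamicAllocation.enabled, emphasis mine):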

This requires the following configurations to be set: spark.dynamicAllocation.minExecutors, spark.dynamicAllocation.maxExecutors, and spark.shuffle.service.enabled

If you look at Spark's 1.2 branch, you will see that these values default to -1 when you do not specify them:

// Lower and upper bounds on the number of executors. These are required.
private val minNumExecutors = conf.getInt("spark.dynamicAllocation.minExecutors", -1)
private val maxNumExecutors = conf.getInt("spark.dynamicAllocation.maxExecutors", -1)
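
To make the link between the -1 defaults and the exception concrete, here is a minimal sketch of that kind of check. It is not the Spark source: the `DynamicAllocationCheck` object, its `getInt` helper, and the use of a plain `Map` in place of `SparkConf` are assumptions made for illustration. Only the property names, the -1 default, and the error message come from the code and stack trace above.

```scala
// Simplified stand-in for the validation behind the stack trace; not the Spark source.
object DynamicAllocationCheck {
  // Hypothetical helper: reads an Int setting from a plain Map instead of SparkConf.
  def getInt(conf: Map[String, String], key: String, default: Int): Int =
    conf.get(key).map(_.toInt).getOrElse(default)

  // Returns Left(message) when a bound is missing, Right((min, max)) otherwise.
  def validate(conf: Map[String, String]): Either[String, (Int, Int)] = {
    val min = getInt(conf, "spark.dynamicAllocation.minExecutors", -1)
    val max = getInt(conf, "spark.dynamicAllocation.maxExecutors", -1)
    if (min < 0 || max < 0)
      Left("spark.dynamicAllocation.{min/max}Executors must be set!")
    else
      Right((min, max))
  }
}
```

With an empty configuration both bounds fall back to -1, so the check fails with the same message as in the question; once both keys are present it passes.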

This behavior has since changed. If you look at the more recent Spark 1.6 branch, you will see that they default to 0 and Integer.MAX_VALUE, respectively:

// Lower and upper bounds on the number of executors.
private val minNumExecutors = conf.getInt("spark.dynamicAllocation.minExecutors", 0)
private val maxNumExecutors = conf.getInt("spark.dynamicAllocation.maxExecutors", 
                                           Integer.MAX_VALUE)

This simply means that you need to add these to your SparkConf settings, or to whatever other configuration file you supply to spark-shell:

import org.apache.spark.SparkConf

// Placeholder bounds for illustration; SparkConf.set takes String values.
val sparkConf = new SparkConf()
  .set("spark.dynamicAllocation.minExecutors", "1")
  .set("spark.dynamicAllocation.maxExecutors", "4")
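
Because spark-shell creates its own SparkContext at startup, the two settings have to reach it before the shell launches, for example as `--conf key=value` flags on the command line or as lines in conf/spark-defaults.conf. The sketch below only renders the settings in those two layouts. The `ShellSettings` object, its method names, and the bounds 1 and 4 are assumptions made for illustration, not values recommended by the answer.

```scala
// Renders dynamic-allocation settings in the two layouts spark-shell can read.
object ShellSettings {
  // Placeholder bounds, chosen only for illustration.
  val settings: Seq[(String, String)] = Seq(
    "spark.dynamicAllocation.minExecutors" -> "1",
    "spark.dynamicAllocation.maxExecutors" -> "4"
  )

  // One "--conf key=value" flag per setting, for the spark-shell command line.
  def asConfFlags(s: Seq[(String, String)]): String =
    s.map { case (k, v) => s"--conf $k=$v" }.mkString(" ")

  // One "key value" line per setting, the layout used by conf/spark-defaults.conf.
  def asDefaultsFile(s: Seq[(String, String)]): String =
    s.map { case (k, v) => s"$k $v" }.mkString("\n")
}
```

Appending the result of `ShellSettings.asConfFlags(ShellSettings.settings)` to the `spark-shell` command, or writing `asDefaultsFile` output into the defaults file, should have the same effect as the SparkConf calls above.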
Answered 2016-03-03T20:31:56.470