I set up Spark-0.9.1 to run on mesos-0.13.0 using the steps mentioned here. The Mesos UI shows two registered worker nodes. I want to run these commands in the Spark shell:
> scala> val data = 1 to 10000
> data: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6,
> 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
> 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
> 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
> 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75,
> 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92,
> 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107,
> 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121,
> 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135,
> 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149,
> 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163,
> 164, 165, 166, 167, 168, 169, 170...
> scala> val distData = sc.parallelize(data)
> distData: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:14
Now when I run the collect method, I get the following error:
> scala> distData.filter(_< 10).collect()
14/06/03 19:54:55 INFO SparkContext: Starting job: collect at <console>:17
14/06/03 19:54:55 INFO DAGScheduler: Got job 0 (collect at <console>:17) with 8 output partitions (allowLocal=false)
14/06/03 19:54:55 INFO DAGScheduler: Final stage: Stage 0 (collect at <console>:17)
14/06/03 19:54:55 INFO DAGScheduler: Parents of final stage: List()
14/06/03 19:54:55 INFO DAGScheduler: Missing parents: List()
14/06/03 19:54:55 INFO DAGScheduler: Submitting Stage 0 (FilteredRDD[1] at filter at <console>:17), which has no missing parents
14/06/03 19:54:55 INFO DAGScheduler: Submitting 8 missing tasks from Stage 0 (FilteredRDD[1] at filter at <console>:17)
14/06/03 19:54:55 INFO TaskSchedulerImpl: Adding task set 0.0 with 8 tasks
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:0 as TID 0 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:0 as 1338 bytes in 8 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:1 as TID 1 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:1 as 1338 bytes in 0 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:2 as TID 2 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:2 as 1338 bytes in 0 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:3 as TID 3 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:3 as 1338 bytes in 1 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:4 as TID 4 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:4 as 1338 bytes in 0 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:5 as TID 5 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:5 as 1338 bytes in 0 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:6 as TID 6 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:6 as 1338 bytes in 0 ms
14/06/03 19:54:55 INFO TaskSetManager: Starting task 0.0:7 as TID 7 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:55 INFO TaskSetManager: Serialized task 0.0:7 as 1338 bytes in 0 ms
14/06/03 19:54:56 INFO TaskSetManager: Re-queueing tasks for 201406031732-3213994176-5050-6320-10 from TaskSet 0.0
14/06/03 19:54:56 WARN TaskSetManager: Lost TID 5 (task 0.0:5)
14/06/03 19:54:56 WARN TaskSetManager: Lost TID 7 (task 0.0:7)
14/06/03 19:54:56 WARN TaskSetManager: Lost TID 1 (task 0.0:1)
14/06/03 19:54:56 WARN TaskSetManager: Lost TID 3 (task 0.0:3)
14/06/03 19:54:56 INFO DAGScheduler: Executor lost: 201406031732-3213994176-5050-6320-10 (epoch 0)
14/06/03 19:54:56 INFO BlockManagerMasterActor: Trying to remove executor 201406031732-3213994176-5050-6320-10 from BlockManagerMaster.
14/06/03 19:54:56 INFO BlockManagerMaster: Removed 201406031732-3213994176-5050-6320-10 successfully in removeExecutor
14/06/03 19:54:56 INFO TaskSetManager: Starting task 0.0:3 as TID 8 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:56 INFO TaskSetManager: Serialized task 0.0:3 as 1338 bytes in 0 ms
14/06/03 19:54:56 INFO DAGScheduler: Host gained which was in lost list earlier: host-DSRV04.host
14/06/03 19:54:56 INFO TaskSetManager: Starting task 0.0:1 as TID 9 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:56 INFO TaskSetManager: Serialized task 0.0:1 as 1338 bytes in 0 ms
14/06/03 19:54:56 INFO TaskSetManager: Starting task 0.0:7 as TID 10 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:56 INFO TaskSetManager: Serialized task 0.0:7 as 1338 bytes in 0 ms
14/06/03 19:54:56 INFO TaskSetManager: Starting task 0.0:5 as TID 11 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:56 INFO TaskSetManager: Serialized task 0.0:5 as 1338 bytes in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Re-queueing tasks for 201406031732-3213994176-5050-6320-11 from TaskSet 0.0
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 8 (task 0.0:3)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 2 (task 0.0:2)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 4 (task 0.0:4)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 10 (task 0.0:7)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 6 (task 0.0:6)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 0 (task 0.0:0)
14/06/03 19:54:57 INFO DAGScheduler: Executor lost: 201406031732-3213994176-5050-6320-11 (epoch 1)
14/06/03 19:54:57 INFO BlockManagerMasterActor: Trying to remove executor 201406031732-3213994176-5050-6320-11 from BlockManagerMaster.
14/06/03 19:54:57 INFO BlockManagerMaster: Removed 201406031732-3213994176-5050-6320-11 successfully in removeExecutor
14/06/03 19:54:57 INFO DAGScheduler: Host gained which was in lost list earlier: host-DSRV05.host
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:0 as TID 12 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:0 as 1338 bytes in 1 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:6 as TID 13 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:6 as 1338 bytes in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:7 as TID 14 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:7 as 1338 bytes in 1 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:4 as TID 15 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:4 as 1338 bytes in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:2 as TID 16 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:2 as 1338 bytes in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:3 as TID 17 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:3 as 1338 bytes in 1 ms
14/06/03 19:54:57 INFO TaskSetManager: Re-queueing tasks for 201406031732-3213994176-5050-6320-11 from TaskSet 0.0
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 14 (task 0.0:7)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 16 (task 0.0:2)
14/06/03 19:54:57 WARN TaskSetManager: Lost TID 12 (task 0.0:0)
14/06/03 19:54:57 INFO DAGScheduler: Executor lost: 201406031732-3213994176-5050-6320-11 (epoch 2)
14/06/03 19:54:57 INFO BlockManagerMasterActor: Trying to remove executor 201406031732-3213994176-5050-6320-11 from BlockManagerMaster.
14/06/03 19:54:57 INFO BlockManagerMaster: Removed 201406031732-3213994176-5050-6320-11 successfully in removeExecutor
14/06/03 19:54:57 INFO DAGScheduler: Host gained which was in lost list earlier: host-DSRV05.host
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:0 as TID 18 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:0 as 1338 bytes in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:2 as TID 19 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:2 as 1338 bytes in 0 ms
14/06/03 19:54:57 INFO TaskSetManager: Starting task 0.0:7 as TID 20 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:57 INFO TaskSetManager: Serialized task 0.0:7 as 1338 bytes in 0 ms
14/06/03 19:54:58 INFO TaskSetManager: Re-queueing tasks for 201406031732-3213994176-5050-6320-10 from TaskSet 0.0
14/06/03 19:54:58 WARN TaskSetManager: Lost TID 17 (task 0.0:3)
14/06/03 19:54:58 WARN TaskSetManager: Lost TID 11 (task 0.0:5)
14/06/03 19:54:58 WARN TaskSetManager: Lost TID 13 (task 0.0:6)
14/06/03 19:54:58 WARN TaskSetManager: Lost TID 9 (task 0.0:1)
14/06/03 19:54:58 WARN TaskSetManager: Lost TID 15 (task 0.0:4)
14/06/03 19:54:58 INFO DAGScheduler: Executor lost: 201406031732-3213994176-5050-6320-10 (epoch 3)
14/06/03 19:54:58 INFO BlockManagerMasterActor: Trying to remove executor 201406031732-3213994176-5050-6320-10 from BlockManagerMaster.
14/06/03 19:54:58 INFO BlockManagerMaster: Removed 201406031732-3213994176-5050-6320-10 successfully in removeExecutor
14/06/03 19:54:58 INFO DAGScheduler: Host gained which was in lost list earlier: host-DSRV04.host
14/06/03 19:54:58 INFO TaskSetManager: Starting task 0.0:4 as TID 21 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:58 INFO TaskSetManager: Serialized task 0.0:4 as 1338 bytes in 0 ms
14/06/03 19:54:58 INFO TaskSetManager: Starting task 0.0:1 as TID 22 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:58 INFO TaskSetManager: Serialized task 0.0:1 as 1338 bytes in 0 ms
14/06/03 19:54:58 INFO TaskSetManager: Starting task 0.0:6 as TID 23 on executor 201406031732-3213994176-5050-6320-11: host-DSRV05.host (PROCESS_LOCAL)
14/06/03 19:54:58 INFO TaskSetManager: Serialized task 0.0:6 as 1338 bytes in 0 ms
14/06/03 19:54:58 INFO TaskSetManager: Starting task 0.0:5 as TID 24 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:58 INFO TaskSetManager: Serialized task 0.0:5 as 1338 bytes in 1 ms
14/06/03 19:54:58 INFO TaskSetManager: Starting task 0.0:3 as TID 25 on executor 201406031732-3213994176-5050-6320-10: host-DSRV04.host (PROCESS_LOCAL)
14/06/03 19:54:58 INFO TaskSetManager: Serialized task 0.0:3 as 1338 bytes in 0 ms
14/06/03 19:54:59 INFO TaskSetManager: Re-queueing tasks for 201406031732-3213994176-5050-6320-11 from TaskSet 0.0
14/06/03 19:54:59 WARN TaskSetManager: Lost TID 23 (task 0.0:6)
14/06/03 19:54:59 WARN TaskSetManager: Lost TID 20 (task 0.0:7)
14/06/03 19:54:59 ERROR TaskSetManager: Task 0.0:7 failed 4 times; aborting job
14/06/03 19:54:59 INFO DAGScheduler: Failed to run collect at <console>:17
14/06/03 19:54:59 INFO DAGScheduler: Executor lost: 201406031732-3213994176-5050-6320-11 (epoch 4)
14/06/03 19:54:59 INFO BlockManagerMasterActor: Trying to remove executor 201406031732-3213994176-5050-6320-11 from BlockManagerMaster.
14/06/03 19:54:59 INFO BlockManagerMaster: Removed 201406031732-3213994176-5050-6320-11 successfully in removeExecutor
14/06/03 19:54:59 INFO DAGScheduler: Host gained which was in lost list earlier: host-DSRV05.host
org.apache.spark.SparkException: Job aborted: Task 0.0:7 failed 4 times (most recent failure: unknown)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
>
> scala> 14/06/03 19:55:00 INFO TaskSetManager: Re-queueing tasks for 201406031732-3213994176-5050-6320-10 from TaskSet 0.0
> 14/06/03 19:55:00 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
> 14/06/03 19:55:00 INFO DAGScheduler: Executor lost: 201406031732-3213994176-5050-6320-10 (epoch 5)
> 14/06/03 19:55:00 INFO BlockManagerMasterActor: Trying to remove executor 201406031732-3213994176-5050-6320-10 from BlockManagerMaster.
> 14/06/03 19:55:00 INFO BlockManagerMaster: Removed 201406031732-3213994176-5050-6320-10 successfully in removeExecutor
> 14/06/03 19:55:00 INFO DAGScheduler: Host gained which was in lost list earlier: host-DSRV04.host
I have checked my Spark configuration several times, and it looks fine to me. Any ideas what might be going wrong?
- Thanks