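While taking my first steps with Spark, I ran into a problem submitting jobs to the cluster from application code. Digging through the logs, I noticed periodic WARN messages in the master log: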
15/10/08 13:00:00 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@192.168.254.167:64014] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
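The problem is that this IP address does not exist on our network and is not configured anywhere. The same wrong IP shows up in the worker log when it tries to execute the task (the wrong IP is passed to --driver-url):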
15/10/08 12:58:21 INFO worker.ExecutorRunner: Launch command: "/usr/java/latest//bin/java" "-cp" "/path/spark/spark-1.5.1-bin-hadoop2.6/sbin/../conf/:/path/spark/spark-1.5.1-bin-hadoop2.6/lib/spark-assembly-1.5.1-hadoop2.6.0.jar:/path/spark/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/path/spark/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/path/spark/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/path/hadoop/2.6.0//etc/hadoop/" "-Xms1024M" "-Xmx1024M" "-Dspark.driver.port=64014" "-Dspark.driver.port=53411" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "akka.tcp://sparkDriver@192.168.254.167:64014/user/CoarseGrainedScheduler" "--executor-id" "39" "--hostname" "192.168.10.214" "--cores" "16" "--app-id" "app-20151008123702-0003" "--worker-url" "akka.tcp://sparkWorker@192.168.10.214:37625/user/Worker"
15/10/08 12:59:28 INFO worker.Worker: Executor app-20151008123702-0003/39 finished with state EXITED message Command exited with code 1 exitStatus 1
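One way to narrow this down would be to check whether the address in --driver-url belongs to some interface on the machine running the driver. The following is only a diagnostic sketch: it assumes the unexpected IP comes from a local interface (for example a virtual or VPN adapter), which I have not confirmed. The class and method names (`DriverAddressCheck`, `driverHost`, `driverPort`, `localAddresses`) are made up for illustration and are not part of Spark.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

public class DriverAddressCheck {

    // Extracts the host part of a driver URL like the one in the worker log.
    static String driverHost(String driverUrl) {
        return URI.create(driverUrl).getHost();
    }

    // Extracts the port part of the same URL.
    static int driverPort(String driverUrl) {
        return URI.create(driverUrl).getPort();
    }

    // Lists every address bound to every network interface visible to this JVM.
    static List<String> localAddresses() throws SocketException {
        List<String> result = new ArrayList<>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return result; // no interfaces reported
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                result.add(nic.getName() + " -> " + addr.getHostAddress());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // URL copied from the worker log above.
        String url = "akka.tcp://sparkDriver@192.168.254.167:64014/user/CoarseGrainedScheduler";
        String host = driverHost(url);
        System.out.println("Driver URL host: " + host + ", port: " + driverPort(url));

        // Print all local addresses and flag the one matching the driver URL, if any.
        for (String entry : localAddresses()) {
            boolean match = entry.endsWith(" -> " + host);
            System.out.println((match ? "* " : "  ") + entry);
        }
    }
}
```

Running this on the machine that submits the job should show whether 192.168.254.167 is bound to one of its interfaces. If it is, the driver is probably advertising that interface's address to the cluster.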
Any idea what I am doing wrong and how to fix it?
The Java version is 1.8.0_20, and I am using the prebuilt Spark binaries.
Thanks!