I am trying to submit a Spark job as an Oozie shell action (through Knox), since we are on HDP.
The submission fails, and the YARN application logs show the following error:
17/11/09 10:58:37 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://ie-default/hdp/apps/2.6.2.0-205/spark/spark-hdp-assembly.jar
17/11/09 10:58:37 INFO Client: Source and destination file systems are the same. Not copying hdfs://ie-default/hdp/apps/2.6.2.0-205/spark/spark-hdp-assembly.jar
17/11/09 10:58:37 INFO Client: Uploading resource file:/some/path/hadoop/yarn/local/usercache/user0/appcache/application_1509288897709_76202/container_e77_1509288897709_76202_01_000002/spark-ingest.jar
-> hdfs://ie-default/user/user0/.sparkStaging/application_1509288897709_76204/spark-ingest.jar
17/11/09 10:58:37 INFO Client: Deleting staging directory .sparkStaging/application_1509288897709_76204
Exception in thread "main" java.io.FileNotFoundException: File file:/some/path/hadoop/yarn/local/usercache/user0/appcache/application_1509288897709_76202/container_e77_1509288897709_76202_01_000002/spark-ingest.jar
does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:624)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:850)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:614)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:422)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:340)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:292)
at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:344)
at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:437)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$7.apply(Client.scala:475)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$7.apply(Client.scala:473)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:473)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:778)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1130)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1194)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:750)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
End of LogType:stderr
LogType:stdout
Log Upload Time:Thu Nov 09 10:58:45 +0100 2017
LogLength:159071
Log Contents:
Oozie Launcher starts
Heart beat
{"properties":[{"key":"oozie.launcher.job.id","value":"job_1509288897709_76202","isFinal":false,"resource":"programatically"},{"key":"oozie.job.id","value":"0023998-171029155940881-oozie-oozi-W","isFinal":false,"resource":"programatically"},{"key":"oozie.action.id","value":"0023998-171029155940881-oozie-oozi-W@spark-submit","isFinal":false,"resource":"programatically"},{"key":"mapreduce.job.tags","value":"oozie-f5763c790ba441dc58ce7950019f32a7","isFinal":false,"resource":"programatically"}]}Starting the execution of prepare actions
Completed the execution of prepare actions successfully
Files in current dir:/some/path/hadoop/yarn/local/usercache/user0/appcache/application_1509288897709_76202/container_e77_1509288897709_76202_01_000002/.
======================
File: .job.xml.crc
File: job.xml
File: .action.xml.crc
File: json-simple-1.1.jar
File: oozie-hadoop-utils-hadoop-2-4.2.0.2.6.2.0-205.jar
File: oozie-sharelib-oozie-4.2.0.2.6.2.0-205.jar
File: aws-java-sdk-core-1.10.6.jar
File: aws-java-sdk-kms-1.10.6.jar
File: okhttp-2.4.0.jar
File: spark-ingest.jar
File: azure-storage-5.3.0.jar
File: propagation-conf.xml
File: okio-1.4.0.jar
File: hadoop-azure-datalake-2.7.3.2.6.2.0-205.jar
File: action.xml
File: jackson-annotations-2.4.0.jar
File: commons-lang3-3.4.jar
File: guava-11.0.2.jar
File: azure-keyvault-core-0.8.0.jar
File: jackson-databind-2.4.4.jar
File: runSparkJob.sh
File: launch_container.sh
File: container_tokens
File: jackson-core-2.4.4.jar
Dir: mr-framework
Dir: hadoop
Dir: tmp
File: hadoop-azure-2.7.3.2.6.2.0-205.jar
File: spark-ingest-1.0-jar-with-dependencies.jar
File: azure-data-lake-store-sdk-2.1.4.jar
File: hadoop-aws-2.7.3.2.6.2.0-205.jar
File: joda-time-2.9.6.jar
File: aws-java-sdk-s3-1.10.6.jar
So my application jar cannot be found, even though a few lines later the log lists it among the files present in the container directory.
My Oozie shell action looks like this:
<action name="spark-submit" cred="hcat">
    <shell xmlns="uri:oozie:shell-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>runSparkJob.sh</exec>
        <file>${sparkJarPath}#spark-ingest.jar</file>
        <file>${SparkJobScriptPath}/${SparkJobScriptName}#runSparkJob.sh</file>
        <capture-output/>
    </shell>
    <ok to="end" />
    <error to="kill" />
</action>
The runSparkJob.sh script looks like this:
spark-submit --master yarn-cluster --queue default --num-executors 4 --executor-cores 4 --driver-memory 5G --executor-memory 20G --class some.identifier.here.sparkingest.MainSpark spark-ingest.jar
I suspect this might be a permission issue, i.e. YARN cannot read the files in that folder, but I don't know how to verify that.
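For reference, the `spark-ingest.jar` entry created by `<file>...#spark-ingest.jar` is a symlink in the container's working directory, so one thing worth checking is whether the symlink target is readable by the user the action runs as. A minimal, self-contained sketch of such a check (using a throwaway temp directory and a demo symlink in place of the real container path, which is hypothetical here) might look like:

```shell
#!/bin/sh
# Sketch with hypothetical paths: a temp dir stands in for the YARN container
# working directory, and a demo symlink mimics Oozie's #spark-ingest.jar mapping.
workdir=$(mktemp -d)
cd "$workdir"
echo "payload" > real.jar
ln -s "$workdir/real.jar" spark-ingest.jar   # mimics the distributed-cache symlink

echo "Running as: $(id -un)"
ls -la spark-ingest.jar                      # shows the symlink and its target
target=$(readlink -f spark-ingest.jar)       # resolve the symlink to the real path
echo "Resolves to: $target"
[ -r "$target" ] && echo "target readable" || echo "target NOT readable"
```

Dropping the same `id`/`ls -la`/`readlink -f` lines (against the real `spark-ingest.jar`) at the top of `runSparkJob.sh` would print, in the action's stdout log, which user the script runs as and whether the symlink's target exists and is readable.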