I developed a Spark Scala application that uses log4j for logging. It works fine when I run it with spark-submit, as follows:
spark-submit --name "Test" --class com.comp.test --conf spark.driver.extraJavaOptions='-Dlog4j.configuration=file:/home/myid/log4j.properties' --queue=root.user /home/myid/dev/data.jar
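This works fine, and my log file is created in the directory specified in log4j.properties.
However, when I run the same application through an Oozie spark action, the log file is not created in the directory specified in log4j.properties.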
log4j.properties:
log4j.appender.myConsoleAppender=org.apache.log4j.ConsoleAppender
log4j.appender.myConsoleAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.myConsoleAppender.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
log4j.appender.RollingAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppender.File=/home/myid/dev/log/dev.log
log4j.appender.RollingAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppender.layout.ConversionPattern=[%p] %d %c %M - %m%n
# By default, everything goes to console and file
log4j.rootLogger=INFO, myConsoleAppender, RollingAppender
# The noisier spark logs go to file only
log4j.logger.spark.storage=INFO, RollingAppender
log4j.additivity.spark.storage=false
log4j.logger.spark.scheduler=INFO, RollingAppender
log4j.additivity.spark.scheduler=false
log4j.logger.spark.CacheTracker=INFO, RollingAppender
log4j.additivity.spark.CacheTracker=false
log4j.logger.spark.CacheTrackerActor=INFO, RollingAppender
log4j.additivity.spark.CacheTrackerActor=false
log4j.logger.spark.MapOutputTrackerActor=INFO, RollingAppender
log4j.additivity.spark.MapOutputTrackerActor=false
log4j.logger.spark.MapOutputTracker=INFO, RollingAppender
log4j.additivity.spark.MapOutputTracker=false
Oozie workflow:
<workflow-app name="OozieApp" xmlns="uri:oozie:workflow:0.5">
<start to="LoadTable"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="LoadTable">
<spark xmlns="uri:oozie:spark-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapreduce.job.queuename</name>
<value>root.user</value>
</property>
</configuration>
<master>yarn</master>
<mode>client</mode>
<name>OozieApp</name>
<class>com.comp.test</class>
<jar>data.jar</jar>
<spark-opts>--queue=root.user --conf spark.driver.extraJavaOptions='-Dlog4j.configuration=file:/home/myid/log4j.properties' </spark-opts>
</spark>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>
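How can I get my custom log file created in the log directory when the application runs through the Oozie spark action?
I could use a shell action that calls spark-submit, but I would prefer to use the spark action itself.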
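One thing that may be worth checking (an assumption, not a confirmed cause): under Oozie the driver is started from the launcher container on some cluster node, so a local path such as /home/myid/log4j.properties may not exist on the host where the driver actually runs. Below is a minimal diagnostic sketch that could be called at the start of the driver. It uses only the JDK and the Scala standard library, and the names `Log4jConfigCheck` and `describe` are hypothetical, not part of the application above.

```scala
import java.io.File
import java.net.{InetAddress, URI}
import scala.util.Try

// Hypothetical helper: reports which host the driver runs on and whether the
// file referenced by -Dlog4j.configuration is visible from that host.
object Log4jConfigCheck {

  // Describes the value of the log4j.configuration system property.
  // For a file: URL it also reports whether the file exists locally.
  def describe(configProp: Option[String]): String = configProp match {
    case None => "log4j.configuration is not set"
    case Some(value) =>
      // new File(URI) throws for anything that is not an absolute file: URI,
      // so Try turns those cases into None.
      Try(new File(new URI(value))).toOption match {
        case Some(f) => s"$value -> exists=${f.exists()}"
        case None    => s"$value (not a file: URL)"
      }
  }

  def main(args: Array[String]): Unit = {
    val host = Try(InetAddress.getLocalHost.getHostName).getOrElse("unknown")
    println(s"driver host: $host")
    println(describe(sys.props.get("log4j.configuration")))
  }
}
```

Comparing the output of a plain spark-submit run with the stdout of the Oozie launcher job should show whether the driver sees the same configuration file in both cases.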