I am trying to copy data from S3 to HDFS using distcp. Below is the shell script that runs distcp.

mkdir.sh
hadoop distcp s3n://bucket-name/foldername hdfs://localhost:8020/user/hdfs/data/

The shell script above works fine when I run it manually.
But when I run the same script through an Oozie workflow, distcp fails.
I am running the workflow with a shell action.

Below is my job.properties file:

nameNode=hdfs://ip-172-31-34-170.us-west-2.compute.internal:8020
jobTracker=ip-172-31-34-195.us-west-2.compute.internal:8032
queueName=default

oozie.libpath=${nameNode}/user/oozie/share/lib
user.name=hdfs
oozie.wf.application.path=${nameNode}/user/${user.name}/oozie/
mkdirshellscript=${oozie.wf.application.path}/mkdir.sh

My workflow.xml is as follows:

<workflow-app name="WorkFlowForShellAction" xmlns="uri:oozie:workflow:0.1">
    <start to="shellAction"/>
    <action name="shellAction">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="/user/hdfs/hari123"/>
                <mkdir path="/user/hdfs/hari123"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${mkdirshellscript}</exec>
            <file>${mkdirshellscript}</file>
        </shell>
        <ok to="end"/>
        <error to="killAction"/>
    </action>
    <kill name="killAction">
        <message>"Killed job due to error"</message>
    </kill>
    <end name="end"/>
</workflow-app>

The Oozie log is as follows:

2014-09-30 10:31:51,102 INFO org.apache.oozie.servlet.CallbackServlet: SERVER[ec2-54-69-26-119.us-west-2.compute.amazonaws.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000018-140930055823135-oozie-oozi-W] ACTION[0000018-140930055823135-oozie-oozi-W@shellAction] callback for action [0000018-140930055823135-oozie-oozi-W@shellAction]
2014-09-30 10:31:51,337 INFO org.apache.oozie.command.wf.ActionEndXCommand: SERVER[ec2-54-69-26-119.us-west-2.compute.amazonaws.com] USER[hdfs] GROUP[-] TOKEN[] APP[WorkFlowForShellActionWithCaptureOutput] JOB[0000018-140930055823135-oozie-oozi-W] ACTION[0000018-140930055823135-oozie-oozi-W@shellAction] ERROR is considered as FAILED for SLA

I want to run distcp in Oozie through a shell action rather than the distcp action.

1 Answer

Try:

<workflow-app name="WorkFlowForShellAction" xmlns="uri:oozie:workflow:0.1">
...
     <start to="shellAction"/>
     <action name="shellAction">
     <shell xmlns="uri:oozie:shell-action:0.1">
     <job-tracker>${jobTracker}</job-tracker>
     <name-node>${nameNode}</name-node>
     <prepare>
       <delete path="/user/hdfs/hari123"/>
       <mkdir path="/user/hdfs/hari123"/>
     </prepare>
     <configuration>
       <property>
         <name>mapred.job.queue.name</name>
         <value>${queueName}</value>
       </property>
     </configuration>
     <exec>./${mkdirshellscript}</exec>
     <file>${mkdirshellscript}#${mkdirshellscript}</file>
     </shell>
...
</workflow-app>
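
One caveat about the snippet above: in the question's job.properties, mkdirshellscript expands to a full HDFS path, so placing it after `./` or after `#` may not produce a usable local file name. A minimal sketch of the same idea follows, assuming a separate property that holds only the file name. `scriptName` is a hypothetical property (for example `scriptName=mkdir.sh` in job.properties) and is not part of the original post.

```xml
<!-- Sketch only: scriptName is a hypothetical property, e.g. scriptName=mkdir.sh -->
<shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- exec names the script as it appears in the task's working directory -->
    <exec>${scriptName}</exec>
    <!-- file ships the HDFS copy; the part after # is the local link name -->
    <file>${mkdirshellscript}#${scriptName}</file>
</shell>
```

With this layout the script is fetched from its HDFS location and linked into the working directory under a plain file name, which is the name `<exec>` then runs. Whether this alone resolves the distcp failure depends on the rest of the environment, so the action's launcher logs are still worth checking.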
answered 2014-10-08T12:47:24.240