
I am currently comparing DAG-based workflow tools such as Airflow and Luigi for scheduling generic Docker containers and Spark jobs.

Can Apache Oozie run generic Docker containers through its shell action? Or is Oozie strictly for Hadoop tools such as Pig and Hive?

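Oozie is integrated with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system-specific jobs (such as Java programs and shell scripts).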


1 Answer


I tried running a Docker container through a shell action and it works. Because a shell action can be executed on any node of the cluster, Docker must be installed on every node.
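Since the action may land on a node where Docker is missing, a script launched by the shell action could check for the binary up front and fail with a clear message. The following is only a minimal sketch: `require_cmd` is a hypothetical helper name, not something Oozie or Docker provides.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original answer): return non-zero
# when a required command is missing from PATH on the current node.
require_cmd() {
    local cmd="$1"
    if ! command -v "$cmd" > /dev/null 2>&1; then
        # HOSTNAME is set by bash; fall back to a generic label otherwise.
        echo "required command '$cmd' not found on ${HOSTNAME:-this node}" >&2
        return 1
    fi
}
```

Placing `require_cmd docker || exit 1` at the top of the script would make the action fail early on a node without Docker, instead of failing later with a less obvious error.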

workflow.xml created from Hue

<workflow-app name="Test docker" xmlns="uri:oozie:workflow:0.5">
    <start to="shell-5c29"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="shell-5c29">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>test_docker.sh</exec>
            <file>/test_docker.sh#test_docker.sh</file>
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>

test_docker.sh

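# Run the hello-world image and capture its stdout in a local file.
docker run hello-world > output.txt
# Copy the file to HDFS, overwriting any existing /output.txt.
hdfs dfs -put -f output.txt /output.txt
echo 'done'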

Contents of the resulting output.txt

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub (amd64)
 3. The Docker daemon created a new container from that image which runs the executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/
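One caveat with test_docker.sh as written: its last command is `echo`, so the script exits with status 0 even if `docker run` or `hdfs dfs -put` fails, and the shell action would then follow the `ok` transition. A possible fail-fast variant is sketched below. The function name `run_and_upload` is hypothetical, and the `--rm` flag (which removes the container after it exits) is an addition that the original script does not use.

```shell
#!/bin/bash
# Hypothetical fail-fast variant of test_docker.sh (not from the original answer).
# Usage: run_and_upload IMAGE HDFS_TARGET
run_and_upload() {
    local image="$1" target="$2"
    # Stop with a non-zero status as soon as either step fails, so the
    # Oozie shell action takes the <error> transition instead of <ok>.
    docker run --rm "$image" > output.txt || return 1
    hdfs dfs -put -f output.txt "$target" || return 1
    echo 'done'
}
```

Ending the script with `run_and_upload hello-world /output.txt` reproduces the behavior shown above when both steps succeed, and returns a non-zero exit status otherwise.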
Answered 2019-10-31T11:40:24.877