
I am automating Hive jobs using Oozie scripts.
In workflow.xml I can read the values defined in the PowerShell script file (the Oozie job script).
In the hql file, however, I cannot get the values defined in that same PowerShell script file.

PowerShell script file:

$hiveScript = "$storageUri/Oozie/input/useooziewf.hql"
#$hiveScript = "$storageUri/Oozie/input/"
$hiveTableName = "log4jlogs"
$hiveDataFolder = "$storageUri"
$hiveOutputFolder = "$storageUri/OozieOutput"
$passwd = ConvertTo-SecureString $clusterPassword -AsPlainText -Force
$creds = New-Object System.Management.Automation.PSCredential ($clusterUsername, $passwd)
Use-AzureHDInsightCluster $clusterName


$OoziePayload =  @"
<?xml version="1.0" encoding="UTF-8"?>
<configuration>

   <property>
       <name>nameNode</name>
       <value>$storageUri</value>
   </property>

   <property>
       <name>jobTracker</name>
       <value>jobtrackerhost:9010</value>
   </property>

   <property>
       <name>queueName</name>
       <value>default</value>
   </property>

   <property>
       <name>oozie.use.system.libpath</name>
       <value>true</value>
   </property>

   <property>
       <name>hiveScript</name>
       <value>$hiveScript</value>
   </property>

   <property>
       <name>hiveTableName</name>
       <value>$hiveTableName</value>
   </property>

   <property>
       <name>hiveDataFolder</name>
       <value>$hiveDataFolder</value>
   </property>

   <property>
       <name>hiveOutputFolder</name>
       <value>$hiveOutputFolder</value>
   </property>

   <property>
       <name>user.name</name>
       <value>admin</value>
   </property>

   <property>
       <name>oozie.wf.application.path</name>
       <value>$oozieWFPath</value>
   </property>

</configuration>
"@

I start the Oozie job like this:

# create Oozie job
Write-Host "Sending the following Payload to the cluster:" -ForegroundColor Green
Write-Host "`n--------`n$OoziePayload`n--------"
$clusterUriCreateJob = "https://$clusterName.azurehdinsight.net:443/oozie/v1/jobs"
$response = Invoke-RestMethod -Method Post -Uri $clusterUriCreateJob -Credential $creds -Body $OoziePayload -ContentType "application/xml" -OutVariable $OozieJobName #-debug

$jsonResponse = ConvertFrom-Json (ConvertTo-Json -InputObject $response)
$oozieJobId = $jsonResponse[0].("id")
#Write-Host "Oozie job id is $oozieJobId..."

# start Oozie job
Write-Host "Starting the Oozie job $oozieJobId..." -ForegroundColor Green
$clusterUriStartJob = "https://$clusterName.azurehdinsight.net:443/oozie/v1/job/" + $oozieJobId + "?action=start"
$response = Invoke-RestMethod -Method Put -Uri $clusterUriStartJob -Credential $creds | Format-Table -HideTableHeaders #-debug

Hive Job (hql file):

DROP TABLE ${hiveTableName};
CREATE EXTERNAL TABLE ${hiveTableName}(t1 string, t2 string, t3 string, t4 string, t5 string, t6 string, t7 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE LOCATION '${hiveDataFolder}';

2 Answers


Assuming $oozieWFPath refers to an existing workflow XML, you can try adding the parameters to the Hive action:

<action name="myhiveaction">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <script>${hiveScript}</script>
        <param>hiveTableName=${hiveTableName}</param>
        <param>hiveDataFolder=${hiveDataFolder}</param>
    </hive>
    ...
</action>

The last two param nodes should pass the Oozie variables through to the Hive script.
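
For context, here is a minimal sketch of how a complete workflow.xml around that action might look. The workflow name `useooziewf`, the `fail` and `end` node names, and the queue property are assumptions for illustration only; the property names inside `${...}` are the ones defined in the question's payload.

```xml
<!-- Sketch only: the app name and the "fail"/"end" node names are hypothetical -->
<workflow-app name="useooziewf" xmlns="uri:oozie:workflow:0.2">
    <start to="myhiveaction"/>
    <action name="myhiveaction">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <!-- Assumed: route the job to the queue named in the payload -->
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>${hiveScript}</script>
            <!-- Each param exposes one Oozie property to the hql script -->
            <param>hiveTableName=${hiveTableName}</param>
            <param>hiveDataFolder=${hiveDataFolder}</param>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

With the values from the question, ${hiveTableName} in the hql file would then be expected to resolve to log4jlogs and ${hiveDataFolder} to the value of $storageUri. Any other variable used in the script would presumably need its own param line.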

answered 2014-04-24T18:30:52.333

You can find an example in Use Oozie with HDInsight, http://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-oozie/. More HDInsight articles are available at http://azure.microsoft.com/en-us/documentation/services/hdinsight/.

answered 2014-04-25T12:20:43.010