The first few lines of the output of hadoop job -list all are:
X jobs submitted
States are:
Running : 1 Succeded : 2 Failed : 3 Prep : 4
JobId State StartTime UserName Priority SchedulingInfo
The job lines that follow look like:
job_201309171413_38136 1 1382455374980 somebody NORMAL 0 running map tasks using 0 map slots. 0 additional slots reserved. 1 running reduce tasks using 1 reduce slots. 0 additional slots reserved.
job_201309171413_37222 2 1382430339635 somebody NORMAL 0 running map tasks using 0 map slots. 0 additional slots reserved. 0 running reduce tasks using 0 reduce slots. 0 additional slots reserved.
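The second column is the State of the job. According to the header lines, 1 means Running and 2 means Succeeded (the header itself spells it "Succeded"). This is not the clearest format: there are 4 header lines, you have to refer back to the header to work out what the state codes actually mean, and there is no way to ask for the status of just one job.
The simplest way to parse this output for a specific job is to build the code-to-name map from the header with awk and then look up the job's row: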
$ job_id=job_201309171413_38136
$ hadoop job -list all | awk -v job_id=${job_id} 'BEGIN{OFS="\t"; FS="\t"; final_state="Unknown"} $0 == "States are:" {getline; for(i=1;i<=NF;i++) { split($i,s," "); states[s[3]] = s[1] }} $1==job_id { final_state=states[$2]; exit} END{print final_state}'
Running
$ job_id=job_201309171413_37222
$ hadoop job -list all | awk -v job_id=${job_id} 'BEGIN{OFS="\t"; FS="\t"; final_state="Unknown"} $0 == "States are:" {getline; for(i=1;i<=NF;i++) { split($i,s," "); states[s[3]] = s[1] }} $1==job_id { final_state=states[$2]; exit} END{print final_state}'
Succeded
$ job_id=foobar
$ hadoop job -list all | awk -v job_id=${job_id} 'BEGIN{OFS="\t"; FS="\t"; final_state="Unknown"} $0 == "States are:" {getline; for(i=1;i<=NF;i++) { split($i,s," "); states[s[3]] = s[1] }} $1==job_id { final_state=states[$2]; exit} END{print final_state}'
Unknown
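To avoid repeating the one-liner, the same awk logic can be wrapped in a small shell function. This is only a sketch: the name job_state_from_list is hypothetical, and it assumes, as the one-liner above does, that the listing is tab-separated and that the line after "States are:" holds the "Name : code" pairs. It reads the listing on stdin, so it can also be run against saved output.

```shell
# Hypothetical helper (not part of Hadoop): reads `hadoop job -list all`
# output on stdin and prints the state name for the job id given as $1.
# Prints "Unknown" if the job id or its state code is not found.
job_state_from_list() {
  awk -v job_id="$1" '
    BEGIN { FS = "\t"; final_state = "Unknown" }
    # The line after "States are:" is assumed to hold tab-separated
    # "Name : code" pairs, e.g. "Running : 1".
    $0 == "States are:" {
      getline
      for (i = 1; i <= NF; i++) {
        # Only accept fields that split into exactly: Name, ":", code
        if (split($i, s, " ") == 3) states[s[3]] = s[1]
      }
      next
    }
    # Job rows: column 1 is the job id, column 2 is the numeric state code.
    $1 == job_id {
      final_state = ($2 in states) ? states[$2] : "Unknown"
      exit
    }
    END { print final_state }
  '
}

# Usage (assumes the hadoop CLI is on PATH):
#   hadoop job -list all | job_state_from_list job_201309171413_38136
```

Because the names come straight from the header, the function prints whatever spelling Hadoop uses there, so a script comparing the result should match against the header's own strings.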