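I usually submit 200 or so jobs at the same time with qsub, and I get flooded with 'completed successfully' messages, so I miss the few 'failed' messages and their associated output.
What command can I use to retrieve a list of all the submitted jobs that failed?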
Something like:
while IFS= read -r line; do
    if [ -z "$line" ]; then
        continue    # skip empty lines
    elif [ -z "${line//*completed successfully*}" ]; then
        echo "The job completed successfully"
    elif [ -z "${line//*failed*}" ]; then
        echo "The job has failed"
    else
        echo "Doing something with input: $line"
    fi
done < <(qsub <query args line>)
With this method, you can build variables that you can use later in your script:
success=()  # Using an array so that more than one result can be stored
while IFS= read -r line; do
    if [ -z "$line" ]; then
        continue    # skip empty lines
    elif [ -z "${line//*completed successfully*}" ]; then
        # Assuming results of the form: The job number: #.* completed successfully
        # meaning the job number comes immediately before the word "completed"
        # and the line is space separated:
        jobnr=${line% completed successfully*}
        jobnr=${jobnr##* }
        success+=("$jobnr ok")
    elif [ -z "${line//*failed*}" ]; then
        jobnr=${line% failed*}
        jobnr=${jobnr##* }
        success+=("$jobnr failed")
    fi
done < <(qsub 20 -cmd -line -args)
printf ": %s\n" "${success[@]}"
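# Dummy qsub for testing the loops above: prints one random status line per job.
# $1 is the number of jobs to simulate (default 10).
qsub ()
{
    for ((i=${1:-10}; i--; ))
    do
        case $((RANDOM%10)) in
            1)
                echo "The job $i completed successfully."
                ;;
            2)
                echo "The job $i failed."
                ;;
            *)
                echo "job $i done..."
                ;;
        esac
    done
}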
If your qsub jobs are run in parallel with &, this is a good way to wait for them and see whether any of them ended in failure:
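nbf=0
# Loop in the current shell: piping into "while read" would run the loop in a
# subshell, where wait cannot see the jobs and the nbf count would be lost.
for pid in $(jobs -p); do
    wait "$pid" || nbf=$((nbf + 1))
done
echo "$nbf jobs ended with failure" >&2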
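You can adapt this example to your needs (for example, replace the jobs -p output with a specific list of jobs, or print the PIDs that failed or succeeded, ...).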
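One possible adaptation, as a rough sketch: record each background PID with $! as the jobs are started, then wait on that list and keep track of which PIDs failed. The run_job function below is a hypothetical stand-in for a real job (it simply exits with the status it is given) and is not part of qsub.

```shell
# Hypothetical stand-in for a real job: exits with the status passed as $1.
run_job() { return "$1"; }

pids=""
for status in 0 1 0 2; do
    run_job "$status" &
    pids="$pids $!"          # remember the PID of each background job
done

nbf=0
failed=""
for pid in $pids; do
    # wait returns the exit status of the given child process
    wait "$pid" || { nbf=$((nbf + 1)); failed="$failed $pid"; }
done

echo "$nbf jobs ended with failure" >&2
[ -n "$failed" ] && echo "failed PIDs:$failed" >&2
```

Keeping the PIDs in a plain string rather than a bash array means the same sketch should also work in a POSIX sh.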
Making some assumptions:
qsub ... 2>&1 | grep -vi "completed successfully"
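To see what that filter keeps, here is a small sketch run against made-up output. The fake_qsub_output function is hypothetical, and the message wording is an assumption; the output of a real scheduler may look different.

```shell
# Hypothetical sample output; the message wording is assumed, not taken from qsub.
fake_qsub_output() {
    echo "The job 1 completed successfully."
    echo "The job 2 failed."
    echo "The job 3 Completed Successfully."
}

# -v inverts the match and -i ignores case, so only lines that do NOT
# contain "completed successfully" are kept.
remaining=$(fake_qsub_output 2>&1 | grep -vi "completed successfully")
echo "$remaining"
# prints: The job 2 failed.
```

Because the filter removes the success lines rather than selecting the failure lines, anything else the command prints (warnings, stderr merged in by 2>&1) stays visible as well.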