0

I typically submit around 200 jobs at once with qsub, and I miss the few 'failed' messages because they are buried under the flood of 'completed successfully' messages.

What command can I use to retrieve a list of all the submitted jobs that failed?

4

3 Answers

1

Something like:

# Replace <query args line> with your own qsub arguments.
while IFS= read -r line; do
    if [ -z "$line" ]; then
        continue    # skip empty lines (bash has no `next`)
    elif [[ $line == *"completed successfully"* ]]; then
        echo "The job completed successfully"
    elif [[ $line == *failed* ]]; then
        echo "The job has failed"
    else
        echo "Doing something with input: $line"
    fi
done < <(qsub <query args line>)

With this approach, you can build variables that the rest of your script can use:

results=()  # Using an array so that more than one result can be stored
while IFS= read -r line; do
    if [ -z "$line" ]; then
        continue    # skip empty lines (bash has no `next`)
    elif [[ $line == *"completed successfully"* ]]; then
        # Assuming the result has the form: The job number: #.* completed successfully
        # meaning the job number comes immediately before the word "completed"
        # and the line is space separated:
        jobnr=${line% completed successfully*}
        jobnr=${jobnr##* }
        results+=("$jobnr ok")
    elif [[ $line == *failed* ]]; then
        jobnr=${line% failed*}
        jobnr=${jobnr##* }
        results+=("$jobnr failed")
    fi
done < <(qsub 20 -cmd -line -args)
printf ": %s\n" "${results[@]}"

This was tested with the following stand-in for qsub:

qsub () 
{ 
  for ((i=${1:-10}; i--; 1))
  do
    case $((RANDOM%10)) in 
        1)
            echo The job $i completed successfully.
        ;;
        2)
            echo The job $i failed.
        ;;
        *)
            echo job $i done...
        ;;
    esac;
  done
}
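
Since the question asks only for the failed jobs, the same parsing can be narrowed to print just their job numbers. This is a minimal sketch that assumes lines of the form "The job <N> failed." as produced by the stand-in function above; a real qsub may print something different. The function name list_failed is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print only the job numbers of failed jobs, one per line.
# Assumes lines look like "The job <N> failed." (the stub's format,
# not necessarily what a real qsub prints).
list_failed() {
    local line jobnr
    while IFS= read -r line; do
        case $line in
            *" failed"*)
                jobnr=${line% failed*}   # drop " failed" and everything after it
                jobnr=${jobnr##* }       # keep the last space-separated word
                printf '%s\n' "$jobnr"
                ;;
        esac
    done
}

# Canned input stands in for the real qsub output here.
failed=$(printf '%s\n' \
    'The job 7 completed successfully.' \
    'The job 8 failed.' \
    'job 9 done...' \
    'The job 12 failed.' | list_failed)
echo "$failed"
```

In real use the canned printf would be replaced by the qsub call, with 2>&1 added if the messages arrive on stderr.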
answered 2013-11-09T14:22:56.780
0

If your qsub jobs are run in parallel with &, this is a good way to wait for the jobs and see whether some of them ended in failure:

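nbf=0
# Loop in the current shell: piping into `while` would run the loop in a
# subshell, where the nbf count is lost and wait cannot see these jobs.
for pid in $(jobs -p); do
    wait "$pid" || (( nbf++ ))
done
echo "$nbf jobs ended with failure" >&2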

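You can adapt this example to your needs (for example, replace the jobs -p output with a specific list of jobs, or print the PIDs of the failed or successful jobs, ...).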
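
One such adaptation, sketched under assumptions: record each PID from $! at launch time instead of reading jobs -p, and keep the PIDs that failed. The commands true and false are placeholders for real job commands, and the array names pids and failed_pids are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: launch commands in the background, remember each PID,
# then collect the PIDs that exited with a non-zero status.
# `true` and `false` stand in for real job commands.
pids=()
true  & pids+=("$!")
false & pids+=("$!")
false & pids+=("$!")

failed_pids=()
for pid in "${pids[@]}"; do
    # wait returns the exit status of the given background process
    wait "$pid" || failed_pids+=("$pid")
done

echo "${#failed_pids[@]} of ${#pids[@]} jobs ended with failure" >&2
```

Keeping the PIDs in an array makes it possible to map a failure back to the command that produced it, which a bare counter cannot do.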

answered 2013-11-05T20:10:52.797
-1

Making some assumptions:

qsub ... 2>&1 | grep -vi "completed successfully"
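
A possible variation, offered only as a sketch: keep the complete output in a log file with tee while still displaying just the lines that are not success messages. The function submit_jobs is a hypothetical stand-in for the real qsub invocation, and its messages are invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real submission command.
submit_jobs() {
    echo 'The job 1 completed successfully.'
    echo 'The job 2 failed.'
    echo 'The job 3 completed successfully.'
}

log=$(mktemp)
# Keep the full output in "$log"; show only lines that are not success messages.
shown=$(submit_jobs 2>&1 | tee "$log" | grep -vi 'completed successfully')
echo "$shown"
```

The log file still holds every message, so nothing is lost if the filter turns out to hide something useful.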
answered 2013-10-02T16:34:44.400