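I want to see all the jobs I have recently run on the cluster (completed, failed, and running), with one entry per job. Running sacct returns three rows per job, with State values of FAILED, FAILED, COMPLETED. What does this mean, and how can I get the information I actually want to see?
I also don't understand what a JobName of true means.
Here is a copy of the output: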
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
2160852               R   interact cluster_u+          2  COMPLETED      0:0
2160864               R   interact cluster_u+          2  COMPLETED      0:0
2161424               R   interact cluster_u+          2  COMPLETED      0:0
2161430               R   interact cluster_u+          0 CANCELLED+      0:0
2161431               R   interact cluster_u+          2  COMPLETED      0:0
2161668               R   interact cluster_u+          2  COMPLETED      0:9
2161682          myjob+    general cluster_u+          2     FAILED      1:0
2161682.bat+      batch            cluster_u+          1     FAILED      1:0
2161682.0          true            cluster_u+          1  COMPLETED      0:0
2161683          myjob+    general cluster_u+          2     FAILED      1:0
2161683.bat+      batch            cluster_u+          1     FAILED      1:0
2161683.0          true            cluster_u+          1  COMPLETED      0:0
Submission script (note that the values in <% %> are filled in by the R package BatchJobs):
#!/bin/bash
#SBATCH -J <%= job.name %> # name of the job
#SBATCH -p general
#SBATCH --mem <%= resources$memory %> # memory per node; Slurm reads a bare number as megabytes
#SBATCH -o ./logs/<%= job.name %>_log.txt # log file for the job's output
eval "R --vanilla --slave < <%= rscript %>"
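
To make the "one entry per job" view concrete, here is a minimal sketch that filters sacct output down to job-level rows. It assumes that step rows can be recognised by a dot in the JobID column (as in 2161682.bat+ and 2161682.0 in the output above) and that the first two lines are the header and its dashed rule. The function name one_row_per_job is hypothetical. sacct also documents an -X (--allocations) option that should give the same view directly on the cluster; it is mentioned only in a comment because it cannot be exercised here.

```shell
#!/bin/bash
# Keep only the job-level rows of sacct output read from stdin.
# Assumption: step rows carry a dot in the JobID column (e.g. 2161682.0),
# and lines 1-2 are the header and the dashed separator.
one_row_per_job() {
  awk 'NR <= 2 || $1 !~ /\./'
}

# Sample rows copied from the output in the question.
sample='JobID JobName Partition Account AllocCPUS State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
2161682 myjob+ general cluster_u+ 2 FAILED 1:0
2161682.bat+ batch cluster_u+ 1 FAILED 1:0
2161682.0 true cluster_u+ 1 COMPLETED 0:0'

printf '%s\n' "$sample" | one_row_per_job

# On the cluster, the documented equivalent would presumably be:
#   sacct -X
# or, with the filter above:
#   sacct | one_row_per_job
```

Run as is, this prints the two header lines followed by the single 2161682 row; the .bat+ and .0 step rows are dropped.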