
I'm trying to get the total CPU time used by each job. I've found a few promising sacct fields, but which one should I use?

According to the documentation (https://computing.llnl.gov/linux/slurm/sacct.html), TotalCPU is the sum of SystemCPU and UserCPU but does not include child processes. I want the total including child processes, though.

TotalCPU
    The sum of the SystemCPU and UserCPU time used by the job or job step. The total CPU time of the job may exceed the job's elapsed time for jobs that include multiple job steps. The format of the output is identical to that of the elapsed field.

NOTE: TotalCPU provides a measure of the task's parent process and does not include CPU time of child processes.

The other candidate, cputimeraw, is not documented in the same level of detail:

cputime
    Formatted number of cpu seconds a process was allocated.

cputimeraw
    How much cpu time process was allocated in second format, not formatted like above. 

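I'm leaning toward cputimeraw over TotalCPU, but I want to make sure it is a total that includes any child processes the job spawns. The documentation doesn't say anything about child processes one way or the other.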
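One way to see how the two fields differ is to put them side by side for the same job. Note that the documentation quoted above describes cputime as time the process was *allocated*, which reads as reserved CPU-seconds rather than CPU-seconds actually consumed. Below is a minimal Python sketch, assuming the rows come from something like `sacct -j <jobid> --format=JobID,TotalCPU,CPUTimeRAW --parsable2 --noheader`. The sample rows and the helper names `slurm_time_to_seconds` and `compare_fields` are made up for illustration, not taken from Slurm.

```python
def slurm_time_to_seconds(text):
    """Convert a Slurm time string ([DD-][HH:]MM:SS[.fff]) to seconds."""
    days, _, clock = text.rpartition("-")
    parts = clock.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected time format: {text!r}")
    seconds = float(parts[-1])
    minutes = int(parts[-2])
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return int(days or 0) * 86400 + hours * 3600 + minutes * 60 + seconds


def compare_fields(sacct_output):
    """Return (step, used_seconds, allocated_seconds, used/allocated) per row."""
    rows = []
    for line in sacct_output.strip().splitlines():
        step, total_cpu, cputimeraw = line.split("|")
        used = slurm_time_to_seconds(total_cpu)   # TotalCPU: user + system time
        allocated = int(cputimeraw)               # CPUTimeRAW: allocated seconds
        ratio = used / allocated if allocated else 0.0
        rows.append((step, used, allocated, ratio))
    return rows


# Hypothetical sample rows (JobID|TotalCPU|CPUTimeRAW), not real sacct output.
SAMPLE = """\
12345|01:02:03|14400
12345.0|30:00.500|7200
"""

for step, used, allocated, ratio in compare_fields(SAMPLE):
    print(f"{step}: used {used:.1f}s of {allocated}s allocated ({ratio:.1%})")
```

If the two numbers diverge widely for a job that is known to be busy, that suggests the fields are measuring different things (usage versus allocation) rather than the same total.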

Does anyone have any suggestions?

Thanks,

Robert


1 Answer


The following command gives a good summary:

seff jobid

Output:

Job ID: jobid
Cluster: cluster
User/Group: doe/clusterusers
State: TIMEOUT (exit code 0)
Nodes: 6
Cores per node: 28
CPU Utilized: 32-01:15:44
CPU Efficiency: 9.54% of 336-00:44:48 core-walltime
Job Wall-clock time: 2-00:00:16
Memory Utilized: 58.76 GB
Memory Efficiency: 8.74% of 672.00 GB
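To show how the figures in this output relate to each other, here is a minimal Python sketch that recomputes the CPU efficiency from the values above. It assumes core-walltime is nodes × cores per node × wall-clock time, which is consistent with the numbers shown; the helper name `dhms_to_seconds` is made up for illustration.

```python
def dhms_to_seconds(text):
    """Convert a [DD-]HH:MM:SS string, as printed above, to whole seconds."""
    days, _, clock = text.rpartition("-")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return int(days or 0) * 86400 + hours * 3600 + minutes * 60 + seconds


# Values copied from the output above.
cpu_utilized = dhms_to_seconds("32-01:15:44")
wall_clock = dhms_to_seconds("2-00:00:16")
nodes, cores_per_node = 6, 28

# Assumption: core-walltime = allocated cores * wall-clock time.
core_walltime = nodes * cores_per_node * wall_clock
efficiency = 100 * cpu_utilized / core_walltime

print(f"CPU used:       {cpu_utilized} s")
print(f"Core-walltime:  {core_walltime} s")
print(f"CPU efficiency: {efficiency:.2f}%")
```

The computed core-walltime equals the 336-00:44:48 reported above, so the "CPU Utilized" line is the figure for time actually consumed, while core-walltime is what was reserved.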
answered 2019-02-05T09:04:35.483