cpu-usage - CPU time on multi-core / hyperthreaded processors

Question

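I need to observe the CPU time taken by a process on a multi-core/hyperthreaded machine. Assume a Xeon, Opteron, etc.

Suppose I have 4 cores with hyperthreading, which means 8 "virtual" cores. Let X be the program I want to run while observing how much CPU time it takes.

If I run process X on my CPU, I get CPU time A. Assume A is more than 5 minutes.
If I run 8 copies of the same process X, I get CPU times B1, B2, ..., B8.
If I run 7 copies of the same process X, I get CPU times C1, C2, ..., C7.
If I run 4 copies of the same process X, I get CPU times D1, D2, ..., D4.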

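Questions:

What is the relationship between the numbers A, Bi, Ci and Di?
Is A smaller than Bi? By how much? What about Ci and Di?
Are the times Bi different from one another? What about Ci and Di?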

score 1 · Accepted Answer

What is the relationship between the numbers A, Bi, Ci and Di?

Expect D1=D2=D3=D4=A*1, unless you have L2 cache problems (conflicts, faults, etc.), in which case the factor will be slightly larger than 1.

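Expect B1=B2=B3=B4=...=B8=A*1.3. The factor 1.3 may vary between 1.1 and 2 depending on your application (some processor subparts are hyperthreaded, others are not). It was computed from similar statistics reported on a private forum, given here in the notation of the question: D=23 seconds and A=18 seconds. The unthreaded process did integer computation without input/output. The exact application was checking Adem coefficients in the motivic Steenrod algebra (I don't know what that is; the settings were (2n+e,n) with n=20).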
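The comparison between A and the Bi can be reproduced with a small harness. What follows is only a minimal sketch, assuming a pure-Python integer loop as a stand-in for process X; `WORKLOAD`, `run_copies` and `slowdown` are hypothetical names introduced for illustration, not part of the answer. It starts n identical copies at once and reads back the CPU time each copy reports for itself.

```python
import subprocess
import sys

# Hypothetical stand-in for process X: integer computation, no I/O.
# Each child prints its own CPU time (user + system) when it finishes.
WORKLOAD = """
import sys, time
total = 0
for i in range(int(sys.argv[1])):
    total += i * i
print(time.process_time())
"""

def run_copies(n, iterations=200_000):
    """Start n copies simultaneously; return each copy's CPU time in seconds."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKLOAD, str(iterations)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(n)
    ]
    return [float(p.communicate()[0]) for p in procs]

def slowdown(single, concurrent):
    """Ratio of the mean concurrent CPU time to the single-copy CPU time."""
    return sum(concurrent) / len(concurrent) / single

if __name__ == "__main__":
    a = run_copies(1, 2_000_000)[0]   # A: one copy alone
    b = run_copies(8, 2_000_000)      # B1..B8: one copy per logical core (assumes 4 cores / 8 threads)
    print(f"A={a:.2f}s  mean(B)/A={slowdown(a, b):.2f}")
```

For the two timings quoted above, `slowdown(18, [23])` is about 1.28, which is in line with the 1.3 factor. Measured CPU times include interpreter start-up, so long runs give more meaningful ratios than short ones.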

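In the case of seven processes (the Cs), if you assign each process to a core (with /usr/bin/htop on Linux), then one process (e.g. C5) will have the same execution time as A, and the others (in my example C1, C2, C3, C4, C6, C7) will have the same values as the Ds. If you do not assign processes to cores, and your processes run long enough for the OS to balance them across the cores, they will converge to the mean of the Cs.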
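Pinning can also be done from inside the program rather than interactively. The sketch below swaps in Python's `os.sched_setaffinity` (available on Linux only) for the htop approach described above; `busy` and `cpu_time_pinned` are hypothetical helper names, and the loop is just a placeholder for process X.

```python
import os
import time

def busy(iterations):
    """CPU-bound integer loop standing in for process X (hypothetical workload)."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total

def cpu_time_pinned(cpu, iterations=1_000_000):
    """Pin this process to one logical CPU, run the workload, return CPU seconds."""
    original = os.sched_getaffinity(0)     # remember the currently allowed CPUs
    os.sched_setaffinity(0, {cpu})         # Linux-only: restrict to a single logical CPU
    try:
        start = time.process_time()
        busy(iterations)
        return time.process_time() - start
    finally:
        os.sched_setaffinity(0, original)  # restore the previous affinity

if __name__ == "__main__":
    cpu = min(os.sched_getaffinity(0))     # any CPU this process is allowed to use
    print(f"logical CPU {cpu}: {cpu_time_pinned(cpu):.3f}s")
```

Which logical CPU numbers share a physical core depends on the machine; on Linux, `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` lists the siblings of logical CPU N.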

Are times Bi different between them? What about Ci, Di?

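It depends on your OS scheduler and its configuration. Also, the percentage displayed by /bin/top on Linux is misleading: A, the Bs, the Cs and the Ds will all show close to 100%.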
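One way to see why the percentage says so little: it is roughly CPU time divided by elapsed time, so a CPU-bound process stays near 100% whether or not a hyperthread sibling is slowing it down, and the slowdown only shows up in the absolute CPU time. A minimal sketch of that ratio, with `utilization` as a hypothetical helper and a placeholder loop as the workload:

```python
import time

def utilization(iterations=1_000_000):
    """Return (cpu_seconds, wall_seconds, cpu/wall) for a CPU-bound loop."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    total = 0
    for i in range(iterations):
        total += i * i
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall, cpu / wall

if __name__ == "__main__":
    cpu, wall, ratio = utilization()
    print(f"cpu={cpu:.3f}s wall={wall:.3f}s utilization={ratio:.0%}")
```

Comparing the `cpu` value across the A, B, C and D setups is more informative than comparing the ratio.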

To evaluate performance, don't forget /usr/bin/nettop (and the variants nethogs, nmon, iftop, iptraf), iotop (and the variants iostat, latencytop), as well as collectl (+colmux) and sar (+sag, +sadf).

score 0 · Accepted Answer

As of 2021, there can be high variation when running multiple experiments, for instance differences of over 50%.

Two gold standards:

Run in single-core mode
Disable hyperthreading.

For detecting the issue:

Run the same algorithm multiple times.

In theory this could be used when running experiments:

Run each experiment k times.

However, this is incomplete when comparing running times, as one group of k runs could execute under conditions that are not comparable with those of another group of k runs.

To alleviate that:

Run each experiment k times.
Randomize the order of the experiments.
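The two steps above can be sketched as follows. This is an illustration under assumptions, not a prescribed harness: experiments are taken to be plain Python callables, CPU time is measured with `time.process_time`, and `randomized_runs` is a hypothetical name.

```python
import random
import time

def randomized_runs(experiments, k=2, seed=None):
    """Run each experiment k times in shuffled order; return {name: [cpu_seconds, ...]}."""
    rng = random.Random(seed)
    schedule = [name for name in experiments for _ in range(k)]
    rng.shuffle(schedule)                  # interleave repetitions of different experiments
    results = {name: [] for name in experiments}
    for name in schedule:
        start = time.process_time()
        experiments[name]()
        results[name].append(time.process_time() - start)
    return results

if __name__ == "__main__":
    # Placeholder experiments; substitute the real algorithms being compared.
    experiments = {
        "sum_squares": lambda: sum(i * i for i in range(200_000)),
        "sorting": lambda: sorted(range(200_000, 0, -1)),
    }
    for name, times in randomized_runs(experiments, k=2, seed=0).items():
        print(name, [f"{t:.4f}" for t in times])
```

Shuffling the whole schedule, rather than running the k repetitions of one experiment back to back, spreads any temporary disturbance on the machine across all experiments instead of concentrating it on one.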

For publication purposes, that's not enough but it might be useful for fast turn-around, even with k = 2.

H/T: discussion in the slack space of the planning community, related to the conference ICAPS: https://www.icaps-conference.org
