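We have just taken delivery of a powerful 32-core AMD Opteron server with 128 GB of RAM. It has two 6272 CPUs with 16 cores each. We are running a big, long-running Java task on 30 threads. We have the NUMA optimisations for Linux and Java turned on. Our Java threads mainly use objects that are private to that thread, sometimes read memory that other threads will also read, and very occasionally write to or lock shared objects.

We can't explain why the CPU cores are 25% idle. Below is a dump of "top":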

top - 23:06:38 up 1 day, 23 min,  3 users,  load average: 10.84, 10.27, 9.62
Tasks: 676 total,   1 running, 675 sleeping,   0 stopped,   0 zombie
Cpu(s): 64.5%us,  1.3%sy,  0.0%ni, 32.9%id,  1.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  132138168k total, 131652664k used,   485504k free,    92340k buffers
Swap:  5701624k total,   230252k used,  5471372k free, 13444344k cached
...
top - 22:37:39 up 23:54,  3 users,  load average: 7.83, 8.70, 9.27
Tasks: 678 total,   1 running, 677 sleeping,   0 stopped,   0 zombie
Cpu0  : 75.8%us,  2.0%sy,  0.0%ni, 22.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 77.2%us,  1.3%sy,  0.0%ni, 21.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 77.3%us,  1.0%sy,  0.0%ni, 21.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 77.8%us,  1.0%sy,  0.0%ni, 21.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  : 76.9%us,  2.0%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  : 76.3%us,  2.0%sy,  0.0%ni, 21.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  : 12.6%us,  3.0%sy,  0.0%ni, 84.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  8.6%us,  2.0%sy,  0.0%ni, 89.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  : 77.0%us,  2.0%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  : 77.0%us,  2.0%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 : 77.6%us,  1.7%sy,  0.0%ni, 20.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 : 75.7%us,  2.0%sy,  0.0%ni, 21.4%id,  1.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 : 76.6%us,  2.3%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 : 76.6%us,  2.3%sy,  0.0%ni, 21.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 : 76.2%us,  2.6%sy,  0.0%ni, 15.9%id,  5.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 : 76.6%us,  2.0%sy,  0.0%ni, 21.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu16 : 73.6%us,  2.6%sy,  0.0%ni, 23.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu17 : 74.5%us,  2.3%sy,  0.0%ni, 23.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu18 : 73.9%us,  2.3%sy,  0.0%ni, 23.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu19 : 72.9%us,  2.6%sy,  0.0%ni, 24.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu20 : 72.8%us,  2.6%sy,  0.0%ni, 24.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu21 : 72.7%us,  2.3%sy,  0.0%ni, 25.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu22 : 72.5%us,  2.6%sy,  0.0%ni, 24.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu23 : 73.0%us,  2.3%sy,  0.0%ni, 24.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu24 : 74.7%us,  2.7%sy,  0.0%ni, 22.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu25 : 74.5%us,  2.6%sy,  0.0%ni, 22.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu26 : 73.7%us,  2.0%sy,  0.0%ni, 24.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu27 : 74.1%us,  2.3%sy,  0.0%ni, 23.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu28 : 74.1%us,  2.3%sy,  0.0%ni, 23.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu29 : 74.0%us,  2.0%sy,  0.0%ni, 24.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu30 : 73.2%us,  2.3%sy,  0.0%ni, 24.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu31 : 73.1%us,  2.0%sy,  0.0%ni, 24.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  132138168k total, 131711704k used,   426464k free,    88336k buffers
Swap:  5701624k total,   229572k used,  5472052k free, 13745596k cached

  PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM     TIME+  COMMAND
13865 root      20   0  122g 112g 3.1g S 2334.3 89.6  20726:49 java
27139 jayen     20   0 15428 1728  952 S    2.6  0.0   0:04.21 top
27161 sysadmin  20   0 15428 1712  940 R    1.0  0.0   0:00.28 top
   33 root      20   0     0    0    0 S    0.3  0.0   0:06.24 ksoftirqd/7
  131 root      20   0     0    0    0 S    0.3  0.0   0:09.52 events/0
 1858 root      20   0     0    0    0 S    0.3  0.0   1:35.14 kondemand/0

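A dump of the Java stacks confirms that none of the threads are anywhere near the few places where locks are used, nor anywhere near any disk or network I/O.

I had trouble finding a clear explanation of what "top" means by "idle" versus "wait", but I get the impression that "idle" means "no more threads that need to run", and that doesn't make sense in our case. We are using "Executors.newFixedThreadPool(30)". There are a large number of tasks pending, and each task lasts for 10 seconds or so.

I suspect the explanation requires a good understanding of NUMA. Is the "idle" state what you see when a CPU is waiting for a non-local access? If not, then what is the explanation?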

1 Answer

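This could be a number of things:

  • It could be contention between threads for access to shared data. This could take the form of lock contention, or of extra memory traffic due to read or write barriers, though the latter is unlikely to produce these symptoms.

  • You are leaking worker threads; e.g. they occasionally die and are not replaced.

  • There could be a bottleneck in the executor itself; e.g. it may not respond quickly enough to a task finishing by scheduling the next one.

  • The bottleneck could be the garbage collector, especially if you don't have parallel collection enabled.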
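
One way to narrow down the worker-thread and executor possibilities is to ask the pool itself how busy it is. The following is only a minimal sketch: the class name PoolProbe and the small pool and task counts are made up for illustration (the question uses 30 threads), it assumes a Java 8 or later JVM for the lambda syntax, and it relies on the standard JDK returning a ThreadPoolExecutor from Executors.newFixedThreadPool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolProbe {
    /** One-line snapshot of the pool; the counts are approximate while tasks are running. */
    static String describe(ThreadPoolExecutor pool) {
        return String.format("workers=%d active=%d queued=%d completed=%d",
                pool.getPoolSize(), pool.getActiveCount(),
                pool.getQueue().size(), pool.getCompletedTaskCount());
    }

    public static void main(String[] args) throws InterruptedException {
        // Illustrative sizes only: 4 workers and 10 tasks instead of the question's 30 workers.
        ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newFixedThreadPool(4);
        CountDownLatch started = new CountDownLatch(4);
        CountDownLatch release = new CountDownLatch(1);

        for (int i = 0; i < 10; i++) {
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await(); // stand-in for a long-running task
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        started.await();                    // all 4 workers are now inside a task
        System.out.println(describe(pool)); // snapshot while the pool is saturated

        release.countDown();                // let every task finish
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(describe(pool)); // snapshot after the work has drained
    }
}
```

In the real application, logging describe(pool) periodically would show whether the number of workers and active tasks stays at 30 while tasks are still queued. If the active count regularly drops below the pool size with work pending, the executor or the task hand-off is worth a closer look.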


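This page discusses the NUMA enhancements for Java, and mentions the NUMA-aware GC switch. Try it. Also check out the other GC tuning advice on that page.

This question explains the process states: In Linux, what do all the values in the "top" command mean?

I think the difference between "wa" and "idle" time in the processor summary is that "wa" means the processor has a thread in the "D" state; i.e. waiting for disk I/O. By contrast, a processor whose threads are all waiting in the "S" state would be counted as "idle". (From this perspective, a thread waiting on a lock would be in the S state.)

You could also try top -H to show threads individually.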
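
Before changing GC settings, it may help to check which flags the JVM was actually started with and roughly how much time the collectors have taken. The sketch below uses the standard java.lang.management API; the class name GcShare is made up for illustration. On HotSpot the NUMA switch is -XX:+UseNUMA, normally combined with -XX:+UseParallelGC, and that is presumably the switch the page refers to. The reported collection time is approximate and, for concurrent collectors, may not reflect the time application threads were actually stopped.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

public class GcShare {
    /** Approximate total milliseconds spent in all collectors so far. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();

        // Shows whether flags such as -XX:+UseNUMA were really passed to this JVM.
        System.out.println("JVM arguments: " + runtime.getInputArguments());

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }

        long uptime = Math.max(1, runtime.getUptime()); // ms since JVM start
        System.out.printf("GC share of wall-clock time: %.2f%%%n",
                100.0 * totalGcMillis() / uptime);
    }
}
```

If a single-threaded collector accounts for a large share of wall-clock time, the other cores sit idle during those pauses, which would be consistent with the symptoms in the question.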
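
A rough in-JVM counterpart to top -H is to count threads by their Java state. This is only a sketch, with a made-up class name, and the mapping to Linux process states is approximate: threads that are BLOCKED, WAITING or TIMED_WAITING in Java would typically show up as sleeping in top, while a Java RUNNABLE thread may be either running on a core or blocked in native I/O.

```java
import java.util.EnumMap;
import java.util.Map;

public class ThreadStates {
    /** Counts the live threads of this JVM by java.lang.Thread.State. */
    static Map<Thread.State, Integer> countStates() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // In the real application this would be called from inside the running JVM,
        // for example from a periodic monitoring thread.
        System.out.println(countStates());
    }
}
```

Sampling this a few times while the job runs would show whether roughly 30 threads are RUNNABLE at any moment, or whether a noticeable number of the workers are parked or blocked.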

answered 2012-10-05T03:35:53.873