java-8 - CMS class unloading takes a long time

Question

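Under heavy load, our application shows long GC pauses (400 ms). Investigation showed that the pause occurs in CMS Final Remark, and that the class unloading phase takes far more time than the other phases (10x-100x):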

    (CMS Final Remark) 
        [YG occupancy: 142247 K (294912 K)]
2019-03-13T07:38:30.656-0700: 24252.576: 
        [Rescan (parallel) , 0.0216770 secs]
2019-03-13T07:38:30.677-0700: 24252.598: 
        [weak refs processing, 0.0028353 secs]      
2019-03-13T07:38:30.680-0700: 24252.601: 
        [class unloading, 0.3232543 secs]       
2019-03-13T07:38:31.004-0700: 24252.924: 
        [scrub symbol table, 0.0371301 secs]
2019-03-13T07:38:31.041-0700: 24252.961: 
        [scrub string table, 0.0126352 secs]
        [1 CMS-remark: 2062947K(4792320K)] 2205195K(5087232K), 0.3986822 secs]
[Times: user=0.63 sys=0.01, real=0.40 secs]

Total time for which application threads were stopped: 0.4156259 seconds, 
Stopping threads took: 0.0014133 seconds
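
To put a number on how much class unloading dominates the remark pause, the phase timings can be extracted and compared with the total. The following is only a minimal sketch: the class name `RemarkPhases` and its helpers are hypothetical, and the regular expression is an assumption that fits the phase entries shown in this log, not a general CMS log parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RemarkPhases {
    // Phase entries copied from the Final Remark log above.
    static final String SAMPLE = String.join("\n",
            "[Rescan (parallel) , 0.0216770 secs]",
            "[weak refs processing, 0.0028353 secs]",
            "[class unloading, 0.3232543 secs]",
            "[scrub symbol table, 0.0371301 secs]",
            "[scrub string table, 0.0126352 secs]");

    // Total remark pause as reported on the "1 CMS-remark" line.
    static final double REMARK_TOTAL_SECS = 0.3986822;

    // Matches "[<phase name>, <seconds> secs]"; the name must start with a letter.
    private static final Pattern PHASE =
            Pattern.compile("\\[([A-Za-z][A-Za-z ()]*?)\\s*,\\s*([0-9.]+) secs\\]");

    // Returns phase name -> duration in seconds, in log order.
    static Map<String, Double> parse(String log) {
        Map<String, Double> phases = new LinkedHashMap<>();
        Matcher m = PHASE.matcher(log);
        while (m.find()) {
            phases.put(m.group(1).trim(), Double.parseDouble(m.group(2)));
        }
        return phases;
    }

    // Fraction of the total pause spent in the given phase.
    static double share(Map<String, Double> phases, String phase, double totalSecs) {
        return phases.get(phase) / totalSecs;
    }

    public static void main(String[] args) {
        Map<String, Double> phases = parse(SAMPLE);
        for (Map.Entry<String, Double> e : phases.entrySet()) {
            System.out.printf("%-22s %.7f s  %5.1f%%%n",
                    e.getKey(), e.getValue(),
                    100 * share(phases, e.getKey(), REMARK_TOTAL_SECS));
        }
    }
}
```

For the log above, class unloading accounts for roughly 81% of the 0.3986822 s remark pause.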

This pause always occurs in the first second of the performance test, and its duration ranges from 300 ms to 400+ ms.

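Unfortunately, I have no access to the server (it is under maintenance) and only have logs from a few test runs. I want to be prepared for further investigation once the server is available again, but I have no idea what causes this behavior.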

My first thought was Linux huge pages, but we do not use them.

After spending more time in the logs, I found the following:

Heap after GC invocations=7969 (full 511):
 par new generation   total 294912K, used 23686K [0x0000000687800000, 0x000000069b800000, 0x000000069b800000)
  eden space 262144K,   0% used [0x0000000687800000, 0x0000000687800000, 0x0000000697800000)
  from space 32768K,  72% used [0x0000000699800000, 0x000000069af219b8, 0x000000069b800000)
  to   space 32768K,   0% used [0x0000000697800000, 0x0000000697800000, 0x0000000699800000)
 concurrent mark-sweep generation total 4792320K, used 2062947K [0x000000069b800000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 282286K, capacity 297017K, committed 309256K, reserved 1320960K
  class space    used 33038K, capacity 36852K, committed 38872K, reserved 1048576K
}


Heap after GC invocations=7970 (full 511):
 par new generation   total 294912K, used 27099K [0x0000000687800000, 0x000000069b800000, 0x000000069b800000)
  eden space 262144K,   0% used [0x0000000687800000, 0x0000000687800000, 0x0000000697800000)
  from space 32768K,  82% used [0x0000000697800000, 0x0000000699276df0, 0x0000000699800000)
  to   space 32768K,   0% used [0x0000000699800000, 0x0000000699800000, 0x000000069b800000)
 concurrent mark-sweep generation total 4792320K, used 2066069K [0x000000069b800000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 282303K, capacity 297017K, committed 309256K, reserved 1320960K
  class space    used 33038K, capacity 36852K, committed 38872K, reserved 1048576K
}

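The GC pause under investigation occurred between GC invocations 7969 and 7970. The amount of used Metaspace is almost the same (it actually increased slightly, from 282286K to 282303K).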

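So it looks like the pause is not caused by actually unloading classes that are no longer in use (no space was freed), and it is not a time-to-safepoint problem either, because stopping the threads took very little time (0.0014133 seconds).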
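
The Metaspace comparison between the two heap snapshots can be checked mechanically. This is a small sketch under stated assumptions: `MetaspaceDelta` is a hypothetical helper, and the pattern only targets the `Metaspace       used <N>K` line format shown in the heap dumps above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetaspaceDelta {
    // Metaspace lines copied from the heap snapshots for invocations 7969 and 7970.
    static final String SAMPLE = String.join("\n",
            " Metaspace       used 282286K, capacity 297017K, committed 309256K, reserved 1320960K",
            "  class space    used 33038K, capacity 36852K, committed 38872K, reserved 1048576K",
            " Metaspace       used 282303K, capacity 297017K, committed 309256K, reserved 1320960K",
            "  class space    used 33038K, capacity 36852K, committed 38872K, reserved 1048576K");

    // Only the top-level "Metaspace" line matches; "class space" lines are skipped.
    private static final Pattern USED = Pattern.compile("Metaspace\\s+used (\\d+)K");

    // Returns the used-Metaspace values (in K) in the order they appear in the log.
    static List<Long> usedKb(String log) {
        List<Long> values = new ArrayList<>();
        Matcher m = USED.matcher(log);
        while (m.find()) {
            values.add(Long.parseLong(m.group(1)));
        }
        return values;
    }

    // Difference between the last and the first snapshot, in K.
    static long deltaKb(List<Long> values) {
        return values.get(values.size() - 1) - values.get(0);
    }

    public static void main(String[] args) {
        List<Long> values = usedKb(SAMPLE);
        System.out.println("used Metaspace per snapshot (K): " + values);
        System.out.println("delta (K): " + deltaKb(values));
    }
}
```

A positive delta (17K here) means that no Metaspace was released across the remark pause, which matches the observation that nothing appears to have been unloaded.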

How can this situation be investigated, and what diagnostic information should be collected to prepare properly?
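
One piece of diagnostic information that could be recorded alongside the GC log is the class loading counters and the Metaspace pool usage, sampled before and after the load test. The sketch below uses the standard `java.lang.management` API. The class name `ClassLoadingSnapshot` is hypothetical, the code has to run inside the application JVM (or the same beans have to be read over JMX), and the pool names it filters on ("Metaspace", "Compressed Class Space") are an assumption about HotSpot on JDK 8.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class ClassLoadingSnapshot {
    // Builds a one-line summary of class loading counters and Metaspace pools.
    static String snapshot() {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        StringBuilder sb = new StringBuilder();
        sb.append("loaded=").append(cl.getLoadedClassCount())
          .append(" totalLoaded=").append(cl.getTotalLoadedClassCount())
          .append(" unloaded=").append(cl.getUnloadedClassCount());

        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumed HotSpot pool names; other JVMs may name these differently.
            if (name.contains("Metaspace") || name.contains("Class Space")) {
                MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
                if (usage != null) {
                    sb.append(" [").append(name).append(" usedK=")
                      .append(usage.getUsed() / 1024).append(']');
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

If the unloaded-class counter does not move across a long remark pause, that would support the reading that the time is spent walking class metadata rather than freeing it.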

Technical details

CentOS 5 + JDK 8 + CMS GC with the following flags:

-XX:+CMSClassUnloadingEnabled 
-XX:CMSIncrementalDutyCycleMin=10 
-XX:+CMSIncrementalPacing 
-XX:CMSInitiatingOccupancyFraction=50 
-XX:+CMSParallelRemarkEnabled 
-XX:+DisableExplicitGC 
-XX:InitialHeapSize=5242880000 
-XX:MaxHeapSize=5242880000 
-XX:MaxNewSize=335544320 
-XX:MaxTenuringThreshold=6 
-XX:NewSize=335544320 
-XX:OldPLABSize=16
-XX:+UseCompressedClassPointers 
-XX:+UseCompressedOops 
-XX:+UseConcMarkSweepGC 
-XX:+UseParNewGC
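
As a sanity check that the logs come from a JVM started with these flags, the byte values can be converted to K and compared with the generation sizes printed in the heap snapshots. This is only arithmetic on numbers already shown in the question; `HeapFlagCheck` and its field names are hypothetical.

```java
class HeapFlagCheck {
    // Values from the flag list (bytes).
    static final long MAX_HEAP_BYTES = 5_242_880_000L; // -XX:MaxHeapSize
    static final long NEW_SIZE_BYTES = 335_544_320L;   // -XX:NewSize / -XX:MaxNewSize

    // Values from the "Heap after GC" output (K).
    static final long EDEN_KB = 262_144L;
    static final long SURVIVOR_KB = 32_768L;           // "from" and "to" are the same size
    static final long OLD_KB = 4_792_320L;             // concurrent mark-sweep generation

    static long bytesToKb(long bytes) {
        return bytes / 1024;
    }

    // Young generation as seen in the log: eden plus both survivor spaces.
    static long youngFromLogKb() {
        return EDEN_KB + 2 * SURVIVOR_KB;
    }

    public static void main(String[] args) {
        System.out.println("NewSize flag (K):      " + bytesToKb(NEW_SIZE_BYTES));
        System.out.println("young from log (K):    " + youngFromLogKb());
        System.out.println("MaxHeapSize flag (K):  " + bytesToKb(MAX_HEAP_BYTES));
        System.out.println("young + old, log (K):  " + (youngFromLogKb() + OLD_KB));
    }
}
```

Both pairs agree (327680K for the young generation and 5120000K for the whole heap). The `total 294912K` shown for the par new generation is eden plus one survivor space, since only one survivor space is usable at a time.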
