java - How does NUMA architecture affect ActivePivot performance?

Question

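We are migrating an ActivePivot application to a new server (4-socket Intel Xeon, 512GB of memory). After deploying, we ran our application benchmark (a mix of large OLAP queries running concurrently with real-time transactions). The measured performance is almost two times slower than on our previous server, which has similar processors but half as many cores and half as much memory.

We investigated the differences between the two servers, and it appears the larger one has a NUMA architecture (non-uniform memory access). Each CPU socket is physically close to 1/4 of the memory and further away from the rest. The JVM that runs our application allocates one large global heap, and a random portion of that heap ends up on each NUMA node. Our analysis is that the memory access pattern is essentially random and that CPU cores frequently waste time accessing remote memory.

We are looking for more feedback on running ActivePivot on NUMA servers. Can we configure the ActivePivot cubes or thread pools, change our queries, or configure the operating system?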

score 15 · Accepted Answer

Peter described the general JVM options available today to reduce the performance impact of NUMA architectures. In short, a NUMA-aware JVM partitions the heap with respect to the NUMA nodes, and when a thread creates a new object, the object is allocated on the NUMA node of the core that runs that thread (so if the same thread uses it later, the object will be in local memory). Also, when compacting the heap, the NUMA-aware JVM avoids moving large chunks of data between nodes (which reduces the length of stop-the-world events).

So on any NUMA hardware and for any Java application the -XX:+UseNUMA option should probably be enabled.
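As a minimal sketch of what enabling the flag looks like at launch: the jar name and heap size below are placeholders, not values from the question. The parallel collector is selected explicitly because `-XX:+UseNUMA` on HotSpot was originally implemented for it; G1 only became NUMA-aware in later JDK releases, so check which collector your JDK version supports with this flag.

```shell
#!/bin/sh
# Placeholder values: adjust to your deployment.
APP_JAR="activepivot-app.jar"   # hypothetical jar name
HEAP="256g"                     # hypothetical heap size

# Build the launch command with NUMA-aware heap allocation enabled.
jvm_cmd() {
  echo "java -Xms${HEAP} -Xmx${HEAP} -XX:+UseParallelGC -XX:+UseNUMA -jar ${APP_JAR}"
}

# Dry run: print the command. To launch for real: eval "$(jvm_cmd)"
jvm_cmd
```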

But for ActivePivot that does not help much: ActivePivot is an in-memory database. There are real-time updates but the bulk of the data resides in the main memory for the life of the application. Whatever the JVM options, the data will be split among NUMA nodes, and the threads that execute queries will access memory randomly. Knowing that most sections of the ActivePivot query engine run as fast as memory can be fetched, the NUMA impact is particularly visible.

So how can you get the most from your ActivePivot solution on NUMA hardware?

There is an easy solution when the ActivePivot application only uses a fraction of the resources (we find that this is often the case when several ActivePivot solutions run on the same server). Take, for instance, an ActivePivot solution that only uses 16 cores out of 64, and 256GB out of a terabyte. In that case you can restrict the JVM process itself to a single NUMA node.

On Linux you prefix the JVM launch command with the following ( http://linux.die.net/man/8/numactl ):

numactl --cpunodebind=xxx
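The command above pins the threads of the process to a node. `numactl` also accepts `--membind` to restrict memory allocation to the same node. The sketch below combines the two, assuming node 0 and a hypothetical jar name; `numactl --hardware` lists the nodes actually present on the machine.

```shell
#!/bin/sh
NODE=0                          # NUMA node to bind to (see `numactl --hardware`)
APP_JAR="activepivot-app.jar"   # hypothetical jar name

# Compose a launch command that pins both CPU scheduling and memory
# allocation of the JVM to one NUMA node.
numa_cmd() {
  node="$1"
  echo "numactl --cpunodebind=${node} --membind=${node} java -jar ${APP_JAR}"
}

# Dry run: print the command. To launch for real: eval "$(numa_cmd "$NODE")"
numa_cmd "$NODE"
```

Note that `--membind` is strict: allocations fail if the node runs out of memory, so the heap must fit in that node's share. `--preferred=<node>` is the softer alternative that falls back to other nodes.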

If the entire server is dedicated to one ActivePivot solution, you can leverage the ActivePivot Distributed Architecture to partition the data. If there are 4 NUMA nodes, you start 4 JVMs hosting 4 ActivePivot nodes, each one bound to its NUMA node. With this deployment queries are distributed among the nodes, and each node will perform its share of the work at max performance, within the right NUMA node.
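The one-JVM-per-node deployment could be scripted roughly as follows. This is only a sketch of the process launch: the jar name and the `-Dnode.id` system property are hypothetical, and the ActivePivot distribution settings themselves are not shown.

```shell
#!/bin/sh
NODES=4                         # number of NUMA nodes, as in the example above
APP_JAR="activepivot-app.jar"   # hypothetical jar name

# Print one launch command per NUMA node; each JVM is confined to the
# CPUs and memory of its own node.
launch_plan() {
  n=0
  while [ "$n" -lt "$NODES" ]; do
    # -Dnode.id is a made-up property used here to tell the instances apart.
    echo "numactl --cpunodebind=${n} --membind=${n} java -Dnode.id=${n} -jar ${APP_JAR}"
    n=$((n + 1))
  done
}

# Dry run: print the plan. Each line can then be started in the background.
launch_plan
```

Each JVM's heap should be sized to fit within the memory of a single node (roughly a quarter of the total on a 4-node server), otherwise the strict memory binding cannot be satisfied.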

score 8 · Accepted Answer

You can try using -XX:+UseNUMA

http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html

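If this does not yield the results you expect, you may have to use taskset to lock a JVM to a specific socket, effectively splitting the server into four machines, each running one JVM.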
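A rough sketch of the taskset approach follows. The CPU range `0-15` and the jar name are assumptions for illustration; on Linux, the CPUs that belong to a given node can be read from `/sys/devices/system/node/node<N>/cpulist`.

```shell
#!/bin/sh
APP_JAR="activepivot-app.jar"   # hypothetical jar name

# Build a taskset command that pins the JVM to a list of CPUs,
# for example all the cores of one socket.
taskset_cmd() {
  cpus="$1"
  echo "taskset -c ${cpus} java -jar ${APP_JAR}"
}

# Dry run: print the command. To launch for real: eval "$(taskset_cmd "0-15")"
taskset_cmd "0-15"
```

taskset only constrains where threads run. Memory placement is left to the kernel's allocation policy, which by default tends to allocate on the node of the CPU that first touches the page.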

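I have observed that machines with more sockets have slower access to their memory (even local memory), and so they do not always give you the performance gain you would hope for.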
