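After reading up on it, I decided to try enabling huge pages for Cassandra. However, after setting a few things up, Cassandra would not start at all. I suspect I missed some important setting that is required for Cassandra to actually use huge pages, but I'm not sure what.
Specifically, I ran: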
echo 112 > /proc/sys/vm/hugetlb_shm_group
echo 2048 > /proc/sys/vm/nr_hugepages
mount -t hugetlbfs hugetlbfs /dev/hugepages
Here 112 is the cassandra group ID. I also added the following to cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -XX:+UseLargePages -XX:LargePageSizeInBytes=2m -XX:+AlwaysPreTouch"
I'm using Cassandra 3.11.3 on Debian stretch.
I have been reading the following documents:
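For reference, this is roughly how I read the huge page counters back after applying the settings. It is only a minimal sketch: `meminfo_field` and `hugepage_summary` are helper names I made up for illustration, and the optional file argument exists only so the functions can be pointed at a saved copy of /proc/meminfo.

```shell
# Print one numeric field from a meminfo-style file.
# $1 = field name (e.g. HugePages_Total), $2 = file (defaults to /proc/meminfo).
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Summarise the huge page pool: page counts, page size, and pool size in kB.
hugepage_summary() {
    file="${1:-/proc/meminfo}"
    total=$(meminfo_field HugePages_Total "$file")
    free=$(meminfo_field HugePages_Free "$file")
    rsvd=$(meminfo_field HugePages_Rsvd "$file")
    size_kb=$(meminfo_field Hugepagesize "$file")
    # Fall back to 0 when a field is missing so the arithmetic cannot fail.
    pool_kb=$(( ${total:-0} * ${size_kb:-0} ))
    echo "total=$total free=$free rsvd=$rsvd size_kb=$size_kb pool_kb=$pool_kb"
}

# Only run against the live system when /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    hugepage_summary
fi
```

The group setting can be read back the same way with `cat /proc/sys/vm/hugetlb_shm_group`, which should print 112 on this machine.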
- https://tobert.github.io/tldr/cassandra-java-huge-pages.html
- https://docs.openstack.org/nova/rocky/admin/huge-pages.html
- https://blog.pythian.com/performance-tuning-hugepages-in-linux/
- https://gist.github.com/tobert/c13803b905069b82e1fd
Note: although there is nothing in /var/log/cassandra, I found the following in /var/lib/cassandra/hs_err_1557985546.log:
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 3992977408 bytes for committing reserved memory.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux.cpp:2643), pid=7022, tid=0x00007f10e1e02700
#
# JRE version: (8.0_181-b13) (build )
# Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
.... some stack trace with just numbers ...
/proc/meminfo:
MemTotal: 7657756 kB
MemFree: 2251340 kB
MemAvailable: 2198368 kB
Buffers: 33656 kB
Cached: 102892 kB
SwapCached: 0 kB
Active: 1066784 kB
Inactive: 52664 kB
Active(anon): 983092 kB
Inactive(anon): 316 kB
Active(file): 83692 kB
Inactive(file): 52348 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 19108 kB
Writeback: 0 kB
AnonPages: 982900 kB
Mapped: 26556 kB
Shmem: 440 kB
Slab: 40420 kB
SReclaimable: 21728 kB
SUnreclaim: 18692 kB
KernelStack: 2976 kB
PageTables: 5312 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1731724 kB
Committed_AS: 1227816 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
HugePages_Total: 2048
HugePages_Free: 2047
HugePages_Rsvd: 119
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 71680 kB
DirectMap2M: 7919616 kB
So the problem seems to be that it is not using the huge pages, and there is not enough non-huge memory left. But why?
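As a rough sanity check on that reading, the numbers can be compared directly. This is only a sketch: every value is copied from the hs_err log and the /proc/meminfo output above, and the variable names are mine. It shows that the failed request is larger than what is left for normal pages once the huge page pool is set aside; it does not explain why the JVM fell back to normal pages.

```shell
# Figures copied from the hs_err log and /proc/meminfo output above.
mem_total_kb=7657756          # MemTotal
mem_available_kb=2198368      # MemAvailable
hugepages_total=2048          # HugePages_Total
hugepage_size_kb=2048         # Hugepagesize
failed_mmap_bytes=3992977408  # size of the mmap that failed

# Memory set aside for the huge page pool, and what remains for normal pages.
pool_kb=$((hugepages_total * hugepage_size_kb))
normal_kb=$((mem_total_kb - pool_kb))
failed_mmap_kb=$((failed_mmap_bytes / 1024))

echo "huge page pool:        ${pool_kb} kB"
echo "left for normal pages: ${normal_kb} kB"
echo "MemAvailable:          ${mem_available_kb} kB"
echo "failed mmap request:   ${failed_mmap_kb} kB"

if [ "$failed_mmap_kb" -gt "$normal_kb" ]; then
    echo "request exceeds all non-huge memory on this machine"
fi
```

So the pool takes 4 GiB of the roughly 7.3 GiB total, and the 3899392 kB request cannot fit in the remaining 3463452 kB of normal memory, let alone in the 2198368 kB reported as available.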