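I started seeing this issue in the last few days. The Ganglia gmetad process is terminated by SIGSEGV (segfault) within 5 minutes of starting.
It had been stable for the past few months, so I'm not sure what changed.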
Version - gmetad 3.7.1
I also don't see any core dump, or anything specific to gmetad, in /var/log/messages or /var/log/secure.
System snapshot at the time of the event (from top):
load average: 1.97, 0.99, 0.42
Memory looks fine:
free -m
             total       used       free     shared    buffers     cached
Mem:          7989       3624       4364          0        333       2562
-/+ buffers/cache:        728       7260
Swap:         4095          0       4095
I have a supervisor process that forks and monitors gmetad.
Here is the supervisor log:
2016-10-20 14:34:55,707 INFO exited: gmetad (terminated by SIGSEGV; not expected)
2016-10-20 14:34:55,707 INFO received SIGCLD indicating a child quit
2016-10-20 14:34:57,712 INFO spawned: 'gmetad' with pid 24561
2016-10-20 14:34:59,929 INFO exited: gmetad (terminated by SIGSEGV; not expected)
2016-10-20 14:34:59,929 INFO received SIGCLD indicating a child quit
2016-10-20 14:35:02,932 INFO spawned: 'gmetad' with pid 24593
2016-10-20 14:35:04,897 INFO exited: gmetad (terminated by SIGSEGV; not expected)
2016-10-20 14:35:04,897 INFO received SIGCLD indicating a child quit
2016-10-20 14:35:08,903 INFO spawned: 'gmetad' with pid 24618
2016-10-20 14:35:11,257 INFO exited: gmetad (terminated by SIGSEGV; not expected)
2016-10-20 14:35:11,257 INFO received SIGCLD indicating a child quit
2016-10-20 14:35:12,257 INFO gave up: gmetad entered FATAL state, too many start retries too quickly
Has anyone run into this kind of problem with gmetad specifically? Any pointers are appreciated.