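On an up-to-date Debian Wheezy server, I am using the backports of the following packages:

  • Nagios: nagios3 (3.4.1-5~bpo7+1)
  • Munin: munin (2.0.25-1~bpo70+1)

and nsca (2.9.1-2) to transfer data from Munin to Nagios so that Nagios handles the alerts.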
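For context, the Munin side of this setup is usually a contact entry in munin.conf that pipes alerts into send_nsca. The fragment below is only a sketch of that idea, not my exact configuration: the host name nagios.example.com and both file paths are assumptions.

```
# munin.conf (sketch): define a contact named "nagios" whose alerts are piped to send_nsca.
# nagios.example.com and /etc/send_nsca.cfg are placeholders, not values from this setup.
contact.nagios.command /usr/sbin/send_nsca -H nagios.example.com -c /etc/send_nsca.cfg
```

With a contact like this, Munin submits each graph title as the service description of the passive check, so the title has to match the service_description defined in Nagios.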

Nagios works fine with the Munin services configured as follows:

# generic service template definition
define service{
    name                           generic-munin-service ; The 'name' of this service template
    use generic-service
    check_command                  return-unknown!"No Data from passive check"
    active_checks_enabled          0       ; Active service checks are disabled
    passive_checks_enabled          1
    parallelize_check               1
    notifications_enabled           1
    event_handler_enabled           1
    is_volatile                     1
    notification_interval           120
    notification_period             24x7
    notification_options            w,u,c,r
    check_freshness                1
    freshness_threshold            360
    flap_detection_options         n
    max_check_attempts             2
    register                       0   ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
    ;first_notification_delay      6    ; Delay first notification for false positives (will execute 2 checks : munin sends 1 check every 5 minutes)
}


define service {
    hostgroup_name                munin
    service_description           Disk latency per device :: Average latency for /dev/sda
    use generic-munin-service
    notification_interval         0 ; set > 0 if you want to be renotified
}

define service {
    hostgroup_name                munin
    service_description           Disk latency per device :: Average latency for /dev/sdb
    use generic-munin-service
    notification_interval         0 ; set > 0 if you want to be renotified
}

define service {
    hostgroup_name                munin
    service_description           Disk usage in percent
    use generic-munin-service
    notification_interval         0 ; set > 0 if you want to be renotified
}

define service {
    hostgroup_name                munin
    service_description           Inode usage in percent
    use generic-munin-service
    notification_interval         0 ; set > 0 if you want to be renotified
}

define service {
    hostgroup_name                munin
    service_description           File table usage
    use generic-munin-service
    notification_interval         0 ; set > 0 if you want to be renotified
}

However, when I add other services that are also available on all monitored hosts, they are flagged as UNKNOWN in Nagios:

define service {
    hostgroup_name                 munin
    service_description            Memory usage
    use generic-munin-service
    notification_interval          0 ; set > 0 if you want to be renotified
}

define service {
    hostgroup_name                 munin
    service_description            CPU usage
    use generic-munin-service
    notification_interval          0 ; set > 0 if you want to be renotified
}

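I have already figured out that, depending on the format of the Munin plugin's graph title, Nagios might not understand the incoming data. That is why I updated the packages on the server to the Wheezy backports versions: Munin 2.0.7 is supposed to clean up all titles.

I also tried debugging with a higher debug level, and the log shows: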

[1434122043] SERVICE ALERT: HostIJZI4;Memory usage;UNKNOWN;HARD;2;INCONNU
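One way to narrow this down is to submit a passive result by hand and see whether Nagios accepts it for the same host and service description. The sketch below builds the tab-separated line that send_nsca reads by default (host, service description, return code, plugin output). The helper name nsca_line, the Nagios host name, and the config path are assumptions made for illustration.

```shell
#!/bin/sh
# Build one passive service check line in the default send_nsca input format:
# host<TAB>service description<TAB>return code<TAB>plugin output
# nsca_line is a hypothetical helper name, not part of nsca.
nsca_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Example submission (left commented out; host name and config path are placeholders):
# nsca_line "HostIJZI4" "Memory usage" 0 "manual test" \
#     | /usr/sbin/send_nsca -H nagios.example.com -c /etc/send_nsca.cfg
```

If a hand-built OK result changes the service state, the transport and the service definition are fine, and the difference is most likely in what Munin itself sends for that graph.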

But I will probably need your help to get any further.

1 Answer

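I suggest you update your packages: Nagios Core is currently at 4.1.1 and you are using an old version.

They have fixed a lot of things, so your problem may already be fixed: https://www.nagios.org/projects/nagios-core/history/4x/

answered 2015-09-10T08:34:36.157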