I am using Prometheus with the Blackbox exporter's ICMP prober to monitor the UP/DOWN status of N systems.
Blackbox exporter configuration:
modules:
  icmp:
    prober: icmp
    timeout: 5s
    icmp:
      preferred_ip_protocol: "ip4"
Prometheus configuration:
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'codelab-monitor'
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['prometheus:9090']
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: ['192.168.1.29', '987.234.121.1']
        labels:
          group: 'Build'
      - targets: ['161.92.248.21', '161.92.3.185', '10.10.4.18']
        labels:
          group: 'RND'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackboxexporter:9115
The Blackbox exporter's own probe results are accurate. Note: the unreachable targets are reported as failures, which is what I expect.
Recent Probes
Module Target Result Debug
icmp 192.168.1.29 Failure Logs
icmp 192.168.3.185 Failure Logs
icmp 161.92.248.21 Success Logs
icmp 192.168.4.185 Failure Logs
icmp 987.234.121.1 Failure Logs
icmp 192.168.1.29 Failure Logs
icmp 192.168.3.185 Failure Logs
icmp 161.92.248.21 Success Logs
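The Prometheus results are not accurate: the targets page shows every target as UP. Note: the expected result is that the failed targets are shown as down (0/1).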
blackbox (5/5 up)
Endpoint State Labels Last Scrape Scrape Duration Error
http://blackboxexporter:9115/probe module="icmp" target="161.92.248.21" UP group="RND" instance="161.92.248.21" job="blackbox" 1.43s ago 1.522ms
http://blackboxexporter:9115/probe module="icmp" target="192.168.1.29" UP group="Build" instance="192.168.1.29" job="blackbox" 5.548s ago 1.501s
http://blackboxexporter:9115/probe module="icmp" target="192.168.3.185" UP group="RND" instance="192.168.3.185" job="blackbox" 1.944s ago 1.501s
http://blackboxexporter:9115/probe module="icmp" target="192.168.4.185" UP group="RND" instance="192.168.4.185" job="blackbox" 3.09s ago 1.501s
http://blackboxexporter:9115/probe module="icmp" target="987.234.121.1" UP group="Build" instance="987.234.121.1" job="blackbox" 2.796s ago 1.506ms
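For context on the mismatch: the State column on the targets page reflects the `up` metric, which only records whether Prometheus managed to scrape the exporter's `/probe` endpoint. The outcome of each ICMP probe is exposed by the Blackbox exporter in a separate metric, `probe_success` (1 on success, 0 on failure). Below is a minimal sketch of an alerting rule that checks that metric for the `blackbox` job configured above. The group name, alert name, `for` duration, and severity label are hypothetical placeholders, not part of my current setup.

```yaml
groups:
  - name: blackbox-icmp            # hypothetical group name
    rules:
      - alert: HostUnreachable     # hypothetical alert name
        # probe_success is 0 when the ICMP probe for the target failed,
        # even though the scrape of the exporter itself succeeded (up == 1).
        expr: probe_success{job="blackbox"} == 0
        for: 1m                    # assumed grace period before firing
        labels:
          severity: warning        # assumed severity label
        annotations:
          summary: "ICMP probe failed for {{ $labels.instance }} (group {{ $labels.group }})"
```

Running the expression `probe_success{job="blackbox"}` in the Prometheus expression browser should list one series per `instance`, which can then be compared against the Recent Probes table from the exporter.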