elastalert - ElastAlert rule for CPU usage (percentage)

Question

I'm having trouble with an ElastAlert rule for CPU usage (not load average). I'm not getting any hits or matches. Below is the .yaml file for my CPU rule:

name: CPU usage
type: metric_aggregation
index: metricbeat-*
buffer_time:
  minutes: 10
metric_agg_key: system.cpu.total.pct
metric_agg_type: avg
query_key: beat.hostname
doc_type: doc
bucket_interval:
  minutes: 5
sync_bucket_interval: true
max_threshold: 60.0
filter:
- term:
    metricset.name: cpu
alert:
- "email"
email:
- "xyz@xy.com"

Could you help me work out what I need to change in my rule?

Any help would be appreciated.

Thanks.

score 3 · Accepted Answer

Metricbeat reports CPU values in the range 0 to 1, so a threshold of 60 will never be matched.

Try max_threshold: 0.6; that will probably work.
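To illustrate why the original threshold can never fire, here is a minimal Python sketch of the comparison. It is not ElastAlert's own code: the function name `exceeds_max_threshold` and the sample values are made up for illustration, and it assumes the averaged metric is a fraction where 1.0 means 100%.

```python
from statistics import mean


def exceeds_max_threshold(samples, max_threshold):
    """Return True when the average of the samples is above max_threshold."""
    return mean(samples) > max_threshold


# Hypothetical system.cpu.total.pct samples for a busy host (fractions, not percentages).
cpu_samples = [0.58, 0.63, 0.66]

print(exceeds_max_threshold(cpu_samples, 60.0))  # False: threshold written as a percentage
print(exceeds_max_threshold(cpu_samples, 0.6))   # True: threshold written as a fraction
```

With the threshold written as 60.0, even a fully loaded host stays far below it, which matches the "no hits, no matches" symptom in the question.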

score 0 · Accepted Answer

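The best way to debug ElastAlert issues is to use a command-line option such as --es_debug_trace (for example, --es_debug_trace /tmp/output.txt). It shows the exact curl API calls that ElastAlert makes to Elasticsearch in the background. The query can then be copied into Kibana's Dev Tools for easy analysis and tinkering.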
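If no trace output is at hand yet, a query of roughly this shape can be tried in Dev Tools first. The sketch below builds a plain Elasticsearch search body in Python and prints it as JSON. It is only an approximation written for this thread, not the exact body ElastAlert generates; the helper name `build_cpu_query` and the aggregation names `per_host` and `avg_cpu` are hypothetical, while the field names come from the rule in the question.

```python
import json


def build_cpu_query(minutes=10):
    """Build a search body: average CPU per host over the last `minutes` minutes."""
    return {
        "size": 0,  # only aggregation results are needed, not the documents
        "query": {
            "bool": {
                "filter": [
                    {"term": {"metricset.name": "cpu"}},
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                ]
            }
        },
        "aggs": {
            "per_host": {
                "terms": {"field": "beat.hostname"},
                "aggs": {"avg_cpu": {"avg": {"field": "system.cpu.total.pct"}}},
            }
        },
    }


print(json.dumps(build_cpu_query(), indent=2))
```

The printed body can be pasted under a `GET metricbeat-*/_search` line in Dev Tools. Empty `per_host` buckets would suggest the index pattern or filter does not match any documents, and the `avg_cpu` values show which scale the threshold has to be written in.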

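Most likely, the doc_type: doc setting causes the ES endpoint to look like this: metricbeat-*/doc/_search. You probably don't have a document type named doc, hence no matches. Please remove doc_type and try again.

Also note that the pct value is less than 1, so in your case use max_threshold: 0.6. The following rule works for me, for your reference: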

name: CPU usage

type: metric_aggregation

use_strftime_index: true
index: metricbeat-system.cpu-%Y.%m.%d

buffer_time:
  hours: 1

metric_agg_key: system.cpu.total.pct
metric_agg_type: avg
query_key: beat.hostname

min_doc_count: 1
  
bucket_interval:
  minutes: 5

max_threshold: 0.6

filter:
- term:
    metricset.name: cpu

realert:
  hours: 2
...

Sample match output:

{
  '@timestamp': '2021-08-19T15:06:22Z',
  'beat.hostname': 'MY_BUSY_SERVER',
  'metric_system.cpu.total.pct_avg': 0.6155,
  'num_hits': 50,
  'num_matches': 10
}
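Since the matched value is a fraction, it may help to convert it back to a percentage when reading or reporting a match. A small Python sketch, assuming a match dictionary shaped like the sample above; the helper name `describe_match` is made up for illustration and is not part of ElastAlert.

```python
# Match dictionary copied from the sample output above.
match = {
    "@timestamp": "2021-08-19T15:06:22Z",
    "beat.hostname": "MY_BUSY_SERVER",
    "metric_system.cpu.total.pct_avg": 0.6155,
    "num_hits": 50,
    "num_matches": 10,
}


def describe_match(match, metric_key="metric_system.cpu.total.pct_avg"):
    """Render the averaged CPU fraction as a human-readable percentage."""
    percent = match[metric_key] * 100  # 0.6155 -> about 61.55
    return f"{match['beat.hostname']}: average CPU {percent:.2f}%"


print(describe_match(match))  # MY_BUSY_SERVER: average CPU 61.55%
```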

score 0 · Accepted Answer

Try reducing buffer_time and bucket_interval for testing.

Answered 2018-12-19T06:30:32.317
