
Alert if metric X has dropped by 5% at least once in the last 5 minutes.

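To satisfy this rule, I need to measure the drop between consecutive data points, which arrive at 1-minute intervals, and send an alert if any of those drops is greater than or equal to 5%.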


# First group of rules, runs every 1 minute
# Recording rule which measures the percentage drop between consecutive points
((idelta(metricX{job="A"}[2m]) / (metricX{job="A"} offset 1m)) * 100)

# Recording rule which generates a time series of 1 if the percent drop is >= X%, or 0 otherwise
<insert expression here>

# Second group of rules, runs every 5 minutes
# Alert rule which reads and sums the timeseries of 1's and 0's over the last 5 minutes and alerts if sum is greater than 0
sum_over_time(timeseries_1_0[5m]) > 0

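How do I write the second recording rule? I have already tried clamp_max/min, but I don't think that is what I want. What would help me is an if/else construct in PromQL. Not having experience with time series queries doesn't help either. Any help with this would be greatly appreciated.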


1 Answer



record: metricX:idelta_ratio
expr: ((idelta(metricX{job="A"}[2m]) / (metricX{job="A"} offset 1m)) * 100)

record: metricX:idelta_ratio_le_minus5
expr: metricX:idelta_ratio <= bool -5

alert: MetricXDroppedBy5Percent
expr: sum_over_time(metricX:idelta_ratio_le_minus5[5m]) > 0
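For reference, here is a minimal sketch of how these three rules might be laid out in a rule file, using the 1-minute and 5-minute evaluation intervals from the question. The group names are hypothetical. The 0/1 recording rule is named with underscores only, because a hyphen is not valid in a Prometheus metric name and `le-5[5m]` would be parsed as a subtraction. The `bool` modifier is the closest thing to if/else here: with it, the comparison returns 1 or 0 for every series instead of filtering out the series that fail it.

```yaml
groups:
  # First group: evaluated every minute (group name is hypothetical)
  - name: metricx_recording
    interval: 1m
    rules:
      # Percentage change between the last two samples in the 2m window
      - record: metricX:idelta_ratio
        expr: (idelta(metricX{job="A"}[2m]) / (metricX{job="A"} offset 1m)) * 100
      # 1 if the change is a drop of 5% or more, 0 otherwise
      - record: metricX:idelta_ratio_le_minus5
        expr: metricX:idelta_ratio <= bool -5

  # Second group: evaluated every 5 minutes (group name is hypothetical)
  - name: metricx_alerts
    interval: 5m
    rules:
      # Fires if any of the 0/1 samples in the last 5 minutes was 1
      - alert: MetricXDroppedBy5Percent
        expr: sum_over_time(metricX:idelta_ratio_le_minus5[5m]) > 0
```

As a worked example, assuming samples arrive exactly one minute apart: if two consecutive samples are 200 and 188, idelta gives -12, the ratio is -12 / 200 * 100 = -6, and `-6 <= bool -5` yields 1, so the alert expression becomes true.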

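Note, however, that Prometheus makes no guarantee that your metric will be collected exactly once per minute, or that your rules will be evaluated exactly once per minute. You are also hardcoding the 1m and 2m ranges into your rules, which may go wrong in interesting ways if your scrape interval changes.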
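One possible way to reduce the number of moving parts is to fold everything into a single alert expression with a subquery. This is only a sketch: it assumes Prometheus 2.7 or later (where subqueries are available), the group name is hypothetical, and it replaces idelta with a plain comparison against the value one minute earlier, so it is not identical to the rules above.

```yaml
groups:
  - name: metricx_alerts_subquery   # hypothetical group name
    rules:
      - alert: MetricXDroppedBy5Percent
        # Percentage change versus the value one minute earlier, computed at
        # 1m steps over the last 5 minutes; fires if the lowest value is <= -5.
        expr: |
          min_over_time(
            (
              (metricX{job="A"} - metricX{job="A"} offset 1m)
              / (metricX{job="A"} offset 1m) * 100
            )[5m:1m]
          ) <= -5
```

This no longer depends on two samples landing inside a 2m window or on a separate recording group staying in step, but the 1m comparison is still hardcoded. If the scrape interval grows beyond a minute, both sides of the subtraction may resolve to the same sample and report no change.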

answered 2019-05-06T15:27:57.133