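Problem: no incident is created when a time series exceeds the threshold.
I want to be alerted if 5% of the requests to CloudRun return 4xx. I created an alerting policy with the following query: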
fetch cloud_run_revision::run.googleapis.com/request_count
| { filter metric.response_code_class = '4xx'
; ident }
| group_by [resource.service_name], 1m, max(val())
| ratio
| condition val() > 0.05 '10^2.%'
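To spell out the check I have in mind, here is a minimal Python sketch of the intended condition: the share of 4xx responses among all requests in a window, compared against 5%. The function name and the sample counts are hypothetical, and the sketch does not model MQL's alignment or the max aggregation used in group_by.

```python
def exceeds_4xx_threshold(count_4xx, count_total, threshold=0.05):
    """Return True when the share of 4xx responses is above the threshold."""
    if count_total == 0:
        return False  # no traffic in the window, so nothing to alert on
    return count_4xx / count_total > threshold


# Hypothetical (4xx count, total count) pairs for three one-minute windows.
windows = [(2, 100), (5, 100), (8, 100)]
flags = [exceeds_4xx_threshold(c4, total) for c4, total in windows]
print(flags)  # [False, False, True]
```

Only the last window should open an incident, because the comparison is strict: exactly 5% does not count as exceeding the threshold.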
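In the Cloud Console, I can see that there are in fact time series above the threshold.
I expect an incident to be created. However, that does not happen.
For completeness: I created the alert with Terraform: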
resource "google_monitoring_alert_policy" "cloudrun_http_4xx_errors" {
  display_name = "CloudRun 4xx errors"
  documentation {
    content = "CloudRun returned 4xx for more than 5% of its requests."
  }
  combiner = "OR"
  notification_channels = var.environment == "dev" ? [] : [
    google_monitoring_notification_channel.pubsubchannel.name]
  conditions {
    display_name = "4xx errors"
    condition_monitoring_query_language {
      query = <<EOT
fetch cloud_run_revision::run.googleapis.com/request_count
| { filter metric.response_code_class = '4xx'
; ident }
| group_by [resource.service_name], 1m, max(val())
| ratio
| condition val() > 0.05 '10^2.%'
EOT
      duration = "60s"
    }
  }
}