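I deployed Prometheus from the community Helm chart (14.6.0), which also runs Alertmanager. Alertmanager intermittently logs errors (apparently a templating problem), and the error message itself is not very helpful. I have re-checked the configuration with amtool and it reports no errors.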

level=error ts=2021-08-17T14:43:08.787Z caller=dispatch.go:309 component=dispatcher msg="Notify for alerts failed" num_alerts=2 err="opsgenie/opsgenie[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 422: {\"message\":\"Request body is not processable. Please check the errors.\",\"errors\":{\"message\":\"Message can not be empty.\"},\"took\":0.0,\"requestId\":\"38c37c18-5635-48bc-bb69-bda03e232cce\"}"
level=debug ts=2021-08-17T14:43:08.798Z caller=notify.go:685 component=dispatcher receiver=opsgenie integration=opsgenie[0] msg="Notify success" attempts=1
level=error ts=2021-08-17T14:43:08.804Z caller=dispatch.go:309 component=dispatcher msg="Notify for alerts failed" num_alerts=2 err="opsgenie/opsgenie[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 422: {\"message\":\"Request body is not processable. Please check the errors.\",\"errors\":{\"message\":\"Message can not be empty.\"},\"took\":0.001,\"requestId\":\"70d2ac84-3422-4fe6-9d8b-e601fdc37b25\"}"

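Monitoring is working and alerts are being delivered; I just want to understand how to interpret this error. Enabling debug logging did not provide any more information. What could be wrong?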
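One possible reading of the 422, offered as a sketch rather than a confirmed diagnosis: Opsgenie rejects the request because the `message` field arrives empty. The opsgenie receiver in the configuration sets `message` to `{{ .CommonAnnotations.message }}`, and `CommonAnnotations` only contains annotations that every alert in the notification group shares with the same value. The failing notifications carry `num_alerts=2`, so if those two alerts have different `message` annotations (or none at all), the template renders to an empty string. amtool validates the configuration itself, not the values the templates render to at notification time, which would explain why it reports no errors. The fragment below assumes a fallback to the alert name and then to a fixed string is acceptable; the fallback text is a placeholder, not something from the original setup.

```yaml
receivers:
- name: opsgenie
  opsgenie_configs:
  - api_key: XXX
    api_url: https://api.eu.opsgenie.com/
    # Assumption: use the shared "message" annotation when the group has one,
    # otherwise the shared alertname label, otherwise a fixed placeholder,
    # so that the rendered message is never empty.
    message: '{{ if .CommonAnnotations.message }}{{ .CommonAnnotations.message }}{{ else if .CommonLabels.alertname }}{{ .CommonLabels.alertname }}{{ else }}Prometheus alert (no common message annotation){{ end }}'
```

The remaining fields of the receiver (`details`, `priority`, `tags`, `send_resolved`) would stay as they are. Checking which alert rules lack a `message` annotation, for example rules that only define `summary` or `description`, would confirm or rule out this explanation.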

Alertmanager configuration:

global: {}
receivers:
- name: opsgenie
  opsgenie_configs:
  - api_key: XXX
    api_url: https://api.eu.opsgenie.com/
    details:
      Prometheus alert: ' {{ .CommonLabels.alertname }}, {{ .CommonLabels.namespace }}, {{ .CommonLabels.pod }}, {{ .CommonLabels.dimension_CacheClusterId }}, {{ .CommonLabels.dimension_DBInstanceIdentifier }}, {{ .CommonLabels.dimension_DBClusterIdentifier }}'
    http_config: {}
    message: '{{ .CommonAnnotations.message }}'
    priority: '{{ if eq .CommonLabels.severity "critical" }}P2{{ else if eq .CommonLabels.severity "high" }}P3{{ else if eq .CommonLabels.severity "warning" }}P4{{ else }}P5{{ end }}'
    send_resolved: true
    tags: ' Prometheus, {{ .CommonLabels.namespace }}, {{ .CommonLabels.severity }}, {{ .CommonLabels.alertname }}, {{ .CommonLabels.pod }}, {{ .CommonLabels.kubernetes_node }}, {{ .CommonLabels.dimension_CacheClusterId }}, {{ .CommonLabels.dimension_DBInstanceIdentifier }}, {{ .CommonLabels.dimension_Cluster_Name }}, {{ .CommonLabels.dimension_DBClusterIdentifier }} '
- name: deadmansswitch
  webhook_configs:
  - http_config:
      basic_auth:
        password: XXX
    send_resolved: true
    url: https://api.eu.opsgenie.com/v2/heartbeats/prometheus-nonprod/ping
- name: blackhole
route:
  group_by:
  - alertname
  - namespace
  - kubernetes_node
  - dimension_CacheClusterId
  - dimension_DBInstanceIdentifier
  - dimension_Cluster_Name
  - dimension_DBClusterIdentifier
  - server_name
  group_interval: 5m
  group_wait: 10s
  receiver: opsgenie
  repeat_interval: 5m
  routes:
  - group_interval: 1m
    match:
      alertname: DeadMansSwitch
    receiver: deadmansswitch
    repeat_interval: 1m
  - match_re:
      namespace: XXX
  - match_re:
      alertname: HighMemoryUsage|HighCPULoad|CPUThrottlingHigh
  - match_re:
      namespace: .+
    receiver: blackhole
  - group_by:
    - instance
    match:
      alertname: PrometheusBlackboxEndpoints
  - match_re:
      alertname: .*
  - match_re:
      kubernetes_node: .*
  - match_re:
      dimension_CacheClusterId: .*
  - match_re:
      dimension_DBInstanceIdentifier: .*
  - match_re:
      dimension_Cluster_Name: .*
  - match_re: