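I am trying to create alerts in Prometheus on Kubernetes and send them to a Slack channel. For this I am using the prometheus-community helm-charts (which already include alertmanager). Since I want to use my own alerts, I also created a values.yml (shown below), heavily inspired by an existing example. If I port-forward Prometheus, I can see my alert go from inactive to pending to firing, but no message is sent to Slack. I am fairly confident that my alertmanager configuration is fine (I tested it with some prebuilt alerts from another chart, and those were sent to Slack). So my best guess is that I am adding the alerts the wrong way (in the serverFiles section), but I cannot figure out how to do it correctly. The alertmanager logs also look normal to me. Does anyone know where my problem comes from?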

---
serverFiles:
  alerting_rules.yml: 
    groups:
    - name: example
      rules:
      - alert: HighRequestLatency
        expr: sum(rate(container_network_receive_bytes_total{namespace="kube-logging"}[5m]))>20000
        for: 1m
        labels:
          severity: page
        annotations:
          summary: High request latency

alertmanager:
  persistentVolume:
    storageClass: default-hdd-retain
  ## Deploy alertmanager
  ##
  enabled: true

  ## Service account for Alertmanager to use.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
  ##
  serviceAccount:
    create: true
    name: ""

  ## Configure pod disruption budgets for Alertmanager
  ## ref: https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
  ## This configuration is immutable once created and will require the PDB to be deleted to be changed
  ## https://github.com/kubernetes/kubernetes/issues/45398
  ##
  podDisruptionBudget:
    enabled: false
    minAvailable: 1
    maxUnavailable: ""

  ## Alertmanager configuration directives
  ## ref: https://prometheus.io/docs/alerting/configuration/#configuration-file
  ##      https://prometheus.io/webtools/alerting/routing-tree-editor/
  ##
  config:
    global:
      resolve_timeout: 5m
      slack_api_url: "I changed this url for the stack overflow question"
    route:
      group_by: ['job']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      #receiver: 'slack'
      routes:
      - match:
          alertname: DeadMansSwitch
        receiver: 'null'
      - match:
        receiver: 'slack'
        continue: true
    receivers:
    - name: 'null'
    - name: 'slack'
      slack_configs:
      - channel: 'alerts'
        send_resolved: false
        title: '[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] Monitoring Event Notification'
        text: >-
          {{ range .Alerts }}
            *Alert:* {{ .Annotations.summary }} - `{{ .Labels.severity }}`
            *Description:* {{ .Annotations.description }}
            *Graph:* <{{ .GeneratorURL }}|:chart_with_upwards_trend:> *Runbook:* <{{ .Annotations.runbook }}|:spiral_note_pad:>
            *Details:*
            {{ range .Labels.SortedPairs }} • *{{ .Name }}:* `{{ .Value }}`
            {{ end }}
          {{ end }}

1 Answer

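So I finally solved the problem. Apparently the kube-prometheus-stack and prometheus helm charts work a bit differently. With the prometheus chart, I had to put the configuration (everything from global onward) under alertmanagerFiles.alertmanager.yml instead of alertmanager.config.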
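A minimal sketch of what that move might look like in values.yml, reusing the settings from the question. The Slack URL is a placeholder, and setting a default `receiver` on the root route in place of the empty `match` entry is an assumption of this sketch, not part of the original fix (Alertmanager requires the root route to name a receiver). The `title` and `text` templates are left out for brevity and would carry over unchanged.

```yaml
# values.yml for the prometheus chart (sketch, not the author's exact file)
alertmanagerFiles:
  alertmanager.yml:
    # Everything that used to sit under alertmanager.config goes here.
    global:
      resolve_timeout: 5m
      slack_api_url: "<your Slack webhook URL>"  # placeholder
    route:
      group_by: ['job']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'slack'  # assumed default receiver for the root route
      routes:
      - match:
          alertname: DeadMansSwitch
        receiver: 'null'
    receivers:
    - name: 'null'
    - name: 'slack'
      slack_configs:
      - channel: 'alerts'
        send_resolved: false

# Non-config settings (enabled, persistentVolume, serviceAccount, ...)
# stay under the alertmanager key as before.
alertmanager:
  enabled: true
```

The serverFiles.alerting_rules.yml section from the question can stay where it is. The alert was already reaching the firing state in Prometheus, so the rules were being loaded.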

answered 2020-12-07T09:40:50.580