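I have Prometheus Operator running on Kubernetes. With it, I can monitor my resources and my cluster, but I am not receiving email notifications when alerts fire. What should I do to receive the emails?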
pod/alertmanager-kube-prometheus-0 2/2 Running 0 72m
pod/kube-prometheus-exporter-kube-state-86b466d978-sp24r 2/2 Running 0 161m
pod/kube-prometheus-exporter-node-2zjc6 1/1 Running 0 162m
pod/kube-prometheus-exporter-node-gwxlg 1/1 Running 0 162m
pod/kube-prometheus-exporter-node-ngc5p 1/1 Running 0 162m
pod/kube-prometheus-exporter-node-vcrw4 1/1 Running 0 162m
pod/kube-prometheus-grafana-6c4dffd84d-mfws7 2/2 Running 0 162m
pod/prometheus-kube-prometheus-0 3/3 Running 1 162m
pod/prometheus-operator-545b59ffc9-tpqs5 1/1 Running 0 163m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-operated ClusterIP None <none> 9093/TCP,6783/TCP 162m
service/kube-prometheus NodePort 10.106.17.176 <none> 9090:31984/TCP 162m
service/kube-prometheus-alertmanager NodePort 10.105.17.59 <none> 9093:30365/TCP 162m
service/kube-prometheus-exporter-kube-state ClusterIP 10.105.149.175 <none> 80/TCP 162m
service/kube-prometheus-exporter-node ClusterIP 10.111.234.174 <none> 9100/TCP 162m
service/kube-prometheus-grafana ClusterIP 10.106.183.201 <none> 80/TCP 162m
service/prometheus-operated ClusterIP None <none> 9090/TCP 162m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kube-prometheus-exporter-node 4 4 4 4 4 <none> 162m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/kube-prometheus-exporter-kube-state 1/1 1 1 162m
deployment.apps/kube-prometheus-grafana 1/1 1 1 162m
deployment.apps/prometheus-operator 1/1 1 1 163m
NAME DESIRED CURRENT READY AGE
replicaset.apps/kube-prometheus-exporter-kube-state-5858d86974 0 0 0 162m
replicaset.apps/kube-prometheus-exporter-kube-state-86b466d978 1 1 1 161m
replicaset.apps/kube-prometheus-grafana-6c4dffd84d 1 1 1 162m
replicaset.apps/prometheus-operator-545b59ffc9 1 1 1 163m
NAME READY AGE
statefulset.apps/alertmanager-kube-prometheus 1/1 162m
statefulset.apps/prometheus-kube-prometheus 1/1 162m
I put my AlertManager.yaml configuration in a Secret:
kubectl edit secret alertmanager-kube-prometheus -n monitoring
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  alertmanager.yaml: Z2xvYmFsOgogIHNtdHBfc21hcnRob3N0OiAnc210cC5nbWFpbC5jb206NTg3JwogIHNtdHBfZnJvbTogJ3pvay5jbzIyMkBnbWFpbC5jb20nCiAgc210cF9hdXRoX3VzZXJuYW1lOiAnem9rLmNvMjIyQGdtYWlsLmNvbScKICBzbXRwX2F1dGhfcGFzc3dvcmQ6ICdIaWZ0MjIyJwoKdGVtcGxhdGVzOiAKLSAnL2V0Yy9hbGVydG1hbmFnZXIvdGVtcGxhdGUvKi50bXBsJwoKCnJvdXRlOgogIAogIGdyb3VwX2J5OiBbJ2FsZXJ0bmFtZScsICdjbHVzdGVyJywgJ3NlcnZpY2UnICwgJ3NldmVyaXR5J10KCiAgZ3JvdXBfd2FpdDogMzBzCgogIGdyb3VwX2ludGVydmFsOiA1bQoKICByZXBlYXRfaW50ZXJ2YWw6
kind: Secret
metadata:
  creationTimestamp: "2019-04-14T07:00:50Z"
  labels:
    alertmanager: kube-prometheus
    app: alertmanager
    chart: alertmanager-0.1.7
    heritage: Tiller
    release: kube-prometheus
  name: alertmanager-kube-prometheus
  namespace: monitoring
  resourceVersion: "598489"
  selfLink: /api/v1/namespaces/monitoring/secrets/alertmanager-kube-prometheus
  uid: 099ab6d0-ffff-11e9-9f0d-5254001850dc
type: Opaque
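To check what the Secret actually contains, the data value can be decoded from base64. The following is only a sketch: the helper name decode_secret_value is made up for illustration, the sample string is the first 64 characters of the value shown above, and the commented kubectl command assumes read access to the monitoring namespace.

```shell
# Decode a base64-encoded Secret value and print the plain text.
# decode_secret_value is a hypothetical helper name, not part of kubectl.
decode_secret_value() {
  printf '%s' "$1" | base64 --decode
}

# Local sample: the first 64 characters of the data value shown above.
decode_secret_value 'Z2xvYmFsOgogIHNtdHBfc21hcnRob3N0OiAnc210cC5nbWFpbC5jb206NTg3'
echo

# Against the cluster (assumes kubectl access to the monitoring namespace):
# kubectl get secret alertmanager-kube-prometheus -n monitoring \
#   -o jsonpath='{.data.alertmanager\.yaml}' | base64 --decode
```

Comparing the decoded text with the intended AlertManager.yaml should show whether the Secret holds the complete configuration.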
My AlertManager.yaml looks like this:
global:
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'zok@gmail.com'
  smtp_auth_username: 'zok@gmail.com'
  smtp_auth_password: 'xxxxxx'
templates:
- '/etc/alertmanager/template/*.tmpl'
route:
  group_by: ['alertname', 'cluster', 'service', 'severity']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: email-me
  routes:
  - match_re:
      service: ^(foo1|foo2|baz)$
    receiver: email-me
    routes:
    - match:
        severity: critical
      receiver: email-me
  - match:
      service: files
    receiver: email-me
  - match:
      severity: warning
    receiver: email-me
  - match:
      service: database
    receiver: email-me
    group_by: [alertname, cluster, database]
    routes:
    - match:
        owner: team-X
      receiver: email-me
      continue: true
    - match:
        severity: warning
      receiver: email-me
    - match:
        severity: front-critical
      receiver: email-me
receivers:
- name: 'email-me'
  email_configs:
  - to: 'meisam@gmail.com'
- name: 'team-Y-mails'
  email_configs:
  - to: 'meisam@gmail.com'
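For this file to end up in the Secret, its contents have to be stored base64-encoded as a single unwrapped line under data. A minimal sketch of that encoding step follows; the helper name encode_config_file and the sample file contents are hypothetical and only illustrate the round trip.

```shell
# Encode a file as one unwrapped base64 line, the form a Secret's data field holds.
# encode_config_file is a hypothetical helper name used only for this sketch.
encode_config_file() {
  base64 < "$1" | tr -d '\n'
}

# Round trip on a throwaway sample file (made-up content, not the real config).
sample="$(mktemp)"
printf 'global:\n  smtp_smarthost: smtp.example.com:587\n' > "$sample"
encoded="$(encode_config_file "$sample")"

# Decode again and compare byte for byte with the original file.
printf '%s' "$encoded" | base64 --decode | cmp -s - "$sample" && echo "round trip ok"
rm -f "$sample"
```

If the decoded value differs from the file, or stops partway through, the Secret is not carrying the full configuration.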
My list of alerts in the Prometheus dashboard: