I am trying to monitor an external service (an exporter of Cassandra metrics) with prometheus-operator. I installed prometheus-operator using helm 2.11.0. I installed it using this yaml:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system
and these commands on my kubernetes cluster:
kubectl create -f rbac-config.yml
helm init --service-account tiller --history-max 200
helm install stable/prometheus-operator --name prometheus-operator --namespace monitoring
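Next, following the article "How to monitor an external service", I tried the steps described in it. As suggested, I created Endpoints, a Service, and a ServiceMonitor with labels for the existing Prometheus. Here are my yaml files: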
apiVersion: v1
kind: Endpoints
metadata:
  name: cassandra-metrics80
  labels:
    app: cassandra-metrics80
subsets:
  - addresses:
      - ip: 10.150.1.80
    ports:
      - name: web
        port: 7070
        protocol: TCP

apiVersion: v1
kind: Service
metadata:
  name: cassandra-metrics80
  namespace: monitoring
  labels:
    app: cassandra-metrics80
    release: prometheus-operator
spec:
  externalName: 10.150.1.80
  ports:
    - name: web
      port: 7070
      protocol: TCP
      targetPort: 7070
  type: ExternalName

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cassandra-metrics80
  labels:
    app: cassandra-metrics80
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      app: cassandra-metrics80
      release: prometheus-operator
  namespaceSelector:
    matchNames:
      - monitoring
  endpoints:
    - port: web
      interval: 10s
      honorLabels: true
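The service is not active and all of its labels are dropped. I have tried many things to fix this, such as setting targetLabels and relabeling the targets that were discovered, as described here: prometheus relabeling. Unfortunately, nothing worked. What could be the problem, or how can I investigate it better?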
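For anyone trying to reproduce this, one possible way to inspect the state is sketched below. It is only a sketch: the function name `check_scrape_target` is hypothetical, the defaults are taken from the manifests above, and it assumes `kubectl` is configured for the cluster and the prometheus-operator CRDs are installed. It lists where the Endpoints, Service, and ServiceMonitor actually ended up together with their labels, then prints the `serviceMonitorSelector` of the Prometheus resource so the two can be compared.

```shell
# Hypothetical helper: shows what the cluster holds for one scrape target.
# Read-only; it does not modify any resource.
check_scrape_target() {
  app_label="${1:-cassandra-metrics80}"  # value of the `app` label used in the manifests
  prom_ns="${2:-monitoring}"             # namespace the chart was installed into

  # Which namespace did each object land in, and which labels does it carry?
  kubectl get endpoints,services,servicemonitors.monitoring.coreos.com \
    --all-namespaces -l "app=${app_label}" --show-labels

  # Which ServiceMonitor labels does the Prometheus resource select on?
  kubectl get prometheuses.monitoring.coreos.com -n "${prom_ns}" \
    -o jsonpath='{.items[*].spec.serviceMonitorSelector}'
  echo
}
```

It could be called as `check_scrape_target cassandra-metrics80 monitoring`; the output is whatever the cluster reports, so differences in namespaces or labels between the three objects would show up side by side.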