
I have configured everything shown below, but request_per_second does not appear when I run the command

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1

In the Node.js app that should be monitored I installed prom-client. I tested /metrics and it works fine; the metric "request_count" is among the objects it returns.

Here is the relevant part of that Node code:

(...)
const counter = new client.Counter({
  name: 'request_count',
  help: 'The total number of processed requests'
});
(...)

router.get('/metrics', async (req, res) => {
  res.set('Content-Type', client.register.contentType)
  res.end(await client.register.metrics())
})

This is my ServiceMonitor configuration:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: un1qnx-validation-service-monitor-node
  namespace: default
  labels:
    app: node-request-persistence
    release: prometheus
spec:
  selector:
    matchLabels:
      app: node-request-persistence
  endpoints:
  - interval: 5s
    path: /metrics
    port: "80"
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  namespaceSelector:
    matchNames:
    - un1qnx-aks-development
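
For reference, a ServiceMonitor selects Services (not Pods) by label, and its endpoints "port" field refers to a named port on that Service, so a matching Service has to exist in un1qnx-aks-development. No Service is shown in the question; purely as an illustration of what the ServiceMonitor above would look for, a sketch could be (the Service name and the port name are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: node-request-persistence     # assumed name, not shown in the question
  namespace: un1qnx-aks-development
  labels:
    app: node-request-persistence    # must match the ServiceMonitor's spec.selector.matchLabels
spec:
  selector:
    app: node-request-persistence    # selects the Deployment's pods
  ports:
  - name: web                        # the ServiceMonitor "port" field refers to this name, not the number
    port: 80
    targetPort: node-port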

This is the node-request-persistence Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: node-request-persistence
  namespace: un1qnx-aks-development
  name: node-request-persistence
spec:
  selector:
    matchLabels:
      app: node-request-persistence
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: /metrics
        prometheus.io/port: "80"
      labels:
        app: node-request-persistence
    spec:
      containers:
      - name: node-request-persistence
        image: node-request-persistence
        imagePullPolicy: Always # IfNotPresent
        resources:
          requests:
            memory: "200Mi" # Gi
            cpu: "100m"
          limits:
            memory: "400Mi"
            cpu: "500m"
        ports:
        - name: node-port
          containerPort: 80

This is the Prometheus Adapter configuration:

prometheus:
  url: http://prometheus-server.default.svc.cluster.local
  port: 9090
rules:
  custom:
  - seriesQuery: 'request_count{namespace!="", pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      as: "request_per_second"
    metricsQuery: "round(avg(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))"
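
Once the adapter has loaded this rule (the rules are read when the adapter pod starts, so it may need a restart after changing the values), the metric should be listed by the custom metrics API. A quick check, using the namespace and metric name from above:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/un1qnx-aks-development/pods/*/request_per_second"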

This is the HPA:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: un1qnx-validation-service-hpa-angle
  namespace: un1qnx-aks-development
spec:
  minReplicas: 1
  maxReplicas: 10
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: un1qnx-validation-service-angle
  metrics:
  - type: Pods
    pods:
      metric:
        name: request_per_second
      target:
        type: AverageValue
        averageValue: "5"

The command

kubectl get hpa -n un1qnx-aks-development

results in "unknown/5".
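
When the target shows "unknown", the HPA's events usually say why the metric could not be read from the custom metrics API:

kubectl describe hpa un1qnx-validation-service-hpa-angle -n un1qnx-aks-development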

Also, the command

kubectl get --raw "http://prometheus-server.default.svc.cluster.local:9090/api/v1/series"

results in

Error from server (NotFound): the server could not find the requested resource
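
Note that kubectl get --raw only takes paths on the Kubernetes API server, so the in-cluster Prometheus URL cannot be queried that way. An equivalent check, assuming the prometheus-server Service listens on port 80 (the Prometheus Helm chart's default), is to port-forward and call the series API directly:

kubectl port-forward -n default svc/prometheus-server 9090:80
curl 'http://localhost:9090/api/v1/series?match[]=request_count'

If request_count does not show up there, Prometheus is not scraping the pod at all.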

I think it should return some values about the collected metrics... I suspect the problem is with the ServiceMonitor, but I am new to this.

As you may have noticed, I am trying to scale a deployment based on the pods of a different deployment; I don't know whether that is a problem.

I would appreciate an answer, since this is for my thesis.

Kubernetes - version 1.19.9

Prometheus - chart prometheus-14.2.1, app version 2.26.0

Prometheus Adapter - chart 2.14.2, app version 0.8.4

Everything was installed with Helm.


1 Answer


After a while I found the problem and changed the following:

I changed the port, the query interval, and the names of the resource overrides on the Prometheus Adapter. To find the right names for the resource overrides, you need to port-forward to the Prometheus server and check the labels on the targets page for the application you are monitoring.
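
For example, using the prometheus-server Service from the question:

kubectl port-forward -n default svc/prometheus-server 9090:80

and then open http://localhost:9090/targets to see which labels (here kubernetes_namespace and kubernetes_pod_name) Prometheus attaches to the scraped pod.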

prometheus:
  url: http://prometheus-server.default.svc.cluster.local
  port: 80
rules:
  custom:
  - seriesQuery: 'request_count{kubernetes_namespace!="", kubernetes_pod_name!=""}'
    resources:
      overrides:
        kubernetes_namespace: {resource: "namespace"}
        kubernetes_pod_name: {resource: "pod"}
    name:
      matches: "request_count"
      as: "request_count"
    metricsQuery: "round(avg(rate(<<.Series>>{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>))"
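
With this rule the metric is exposed to the custom metrics API under the name request_count (so an HPA would reference that name rather than request_per_second). It can be verified with:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/un1qnx-aks-development/pods/*/request_count"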

I also added these annotations to the deployment yaml:

spec:
  selector:
    matchLabels:
      app: node-request-persistence
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: /metrics
        prometheus.io/port: "80"
      labels:
        app: node-request-persistence