0

我是 Kubernetes 的新手。我想从 kubernetes 集群中存在的所有 cAdvisor pod 获取容器指标。我已将 cAdvisor 部署为集群上的守护程序集。下面是yaml文件。

我只能看到 cAdvisor 部署在一个节点池的每个节点上。不在新创建的。如何做到这一点?

用于 kubernetes 的cAdvisor daemonset yaml文件:

apiVersion: apps/v1 # for versions before 1.8.0 use extensions/v1beta1 [apps/v1beta2]
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: kube-system
  labels:
    app: cadvisor
spec:
  selector:
    matchLabels:
      name: cadvisor
  template:
    metadata:
      labels:
        name: cadvisor
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: cadvisor
        image: google/cadvisor:latest
        volumeMounts:
        - name: rootfs
          mountPath: /rootfs
          readOnly: true
        - name: var-run
          mountPath: /var/run
          readOnly: false
        - name: sys
          mountPath: /sys
          readOnly: true
        - name: docker
          mountPath: /var/lib/docker
          readOnly: true
        ports:
          - name: http
            containerPort: 8080
            protocol: TCP
        args:
          - --housekeeping_interval=10s
      terminationGracePeriodSeconds: 30
      volumes:
      - name: rootfs
        hostPath:
          path: /
      - name: var-run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /var/lib/docker

kind: Service
apiVersion: v1
metadata:
  name: cadvisor
  namespace: kube-system
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    name: cadvisor
  ports:
  - name: http
    protocol: TCP
    port: 80
    targetPort: 8080

prometheus.yml 文件:


# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'cadvisor_k8s_containers'
    scrape_interval:     15s
    static_configs:
            - targets: [IP-Address]



更新:我可以看到仅从其中一个节点池 pod 中抓取的容器指标。如何也从新添加的节点池中获取容器指标?

4

0 回答 0