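I'm looking for a solution that automatically scales pods out when nodes join the cluster and scales them back in when nodes are removed. We run a WebApp on the nodes, and its pods need to be gracefully evicted/terminated when a node is scheduled to be taken out of service. I looked at using a DaemonSet, but since we use Kops for cluster rolling updates, it ignores DaemonSets eviction (the flag "--ignore-daemonsets" is not supported). As a result, the WebApp "dies" together with the node, which is not acceptable for our application. The ability of HorizontalPodAutoscaler to override the replica count set in the deployment yaml could solve the problem. I want to find a way to dynamically change min/maxReplicas in the HorizontalPodAutoscaler yaml based on the number of nodes in the cluster.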

spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: MyWebApp
  minReplicas: "Num of nodes in the cluster"
  maxReplicas: "Num of nodes in the cluster"

Any ideas on how to get the number of nodes and update the HorizontalPodAutoscaler yaml in the cluster accordingly? Or any other solutions to the problem?
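
For reference, the idea described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the function names and the HPA name "mywebapp" are hypothetical, the node count includes control-plane and NotReady nodes, and something (a cron job, for example) would have to re-run it when nodes come and go. Pinning the replica count to the node count does not by itself place one pod per node; that would need something like pod anti-affinity on the Deployment.

```shell
#!/usr/bin/env bash
# Sketch: pin an HPA's minReplicas/maxReplicas to the current node count.
# The HPA name and namespace passed in are placeholders, not from the question.

# Print the number of nodes currently registered in the cluster.
count_nodes() {
  kubectl get nodes --no-headers | wc -l | tr -d ' '
}

# Patch the named HPA so that min and max replicas both equal the node count.
sync_hpa_to_nodes() {
  local hpa="$1" namespace="${2:-default}" nodes
  nodes="$(count_nodes)"
  if [ "$nodes" -lt 1 ]; then
    echo "no nodes found; leaving $hpa unchanged" >&2
    return 1
  fi
  kubectl patch hpa "$hpa" -n "$namespace" --type merge \
    -p "{\"spec\":{\"minReplicas\":$nodes,\"maxReplicas\":$nodes}}"
}

# Example: sync_hpa_to_nodes mywebapp default
```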

1 Answer

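Have you tried using the nodeSelector spec in the DaemonSet yaml? If you set a node selector in the yaml and then remove the matching label from a node, the DaemonSet should scale down gracefully on that node. Likewise, when you add a new node to the cluster and label it with the custom value, the DaemonSet will scale up.

This works for me, so you can try it and confirm with Kops.

First: label all your nodes with a custom label that you will always have on your cluster.

Example: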

kubectl label nodes k8s-master-1 mylabel=allow_demon_set  
kubectl label nodes k8s-node-1 mylabel=allow_demon_set
kubectl label nodes k8s-node-2 mylabel=allow_demon_set
kubectl label nodes k8s-node-3 mylabel=allow_demon_set
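
If there are many nodes, labeling them one by one gets tedious. A minimal sketch of a helper that applies the label to every node at once is below; the function name is hypothetical, and it assumes every current node should run the DaemonSet. Nodes that join later would still need the label, which with kops could probably be handled through the instance group's nodeLabels setting rather than by hand.

```shell
#!/usr/bin/env bash
# Sketch: apply the selector label to every node currently in the cluster.
# The default label matches the one used in this answer.
label_all_nodes() {
  local label="${1:-mylabel=allow_demon_set}"
  # --all selects every node; --overwrite avoids an error if the key already exists.
  kubectl label nodes --all "$label" --overwrite
}

# Example: label_all_nodes
```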

Then add the node selector to your DaemonSet yaml.

Example.yaml is used as below. Note the added nodeSelector field.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      nodeSelector:
        mylabel: allow_demon_set
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

So the nodes are labeled as below:

$ kubectl get nodes --show-labels
NAME           STATUS   ROLES    AGE   VERSION   LABELS
k8s-master-1   Ready    master   9d    v1.17.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master-1,kubernetes.io/os=linux,mylabel=allow_demon_set,node-role.kubernetes.io/master=
k8s-node-1     Ready    <none>   9d    v1.17.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node-1,kubernetes.io/os=linux,mylabel=allow_demon_set
k8s-node-2     Ready    <none>   9d    v1.17.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node-2,kubernetes.io/os=linux,mylabel=allow_demon_set
k8s-node-3     Ready    <none>   9d    v1.17.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node-3,kubernetes.io/os=linux,mylabel=allow_demon_set

Once you have the correct yaml, start the DaemonSet using it:

$ kubectl create -f Example.yaml

$ kubectl get all -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE           NOMINATED NODE   READINESS GATES
pod/fluentd-elasticsearch-jrgl6   1/1     Running   0          20s   10.244.3.19   k8s-node-3     <none>           <none>
pod/fluentd-elasticsearch-rgcm2   1/1     Running   0          20s   10.244.0.6    k8s-master-1   <none>           <none>
pod/fluentd-elasticsearch-wccr9   1/1     Running   0          20s   10.244.1.14   k8s-node-1     <none>           <none>
pod/fluentd-elasticsearch-wxq5v   1/1     Running   0          20s   10.244.2.33   k8s-node-2     <none>           <none>

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE   SELECTOR
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   9d    <none>

NAME                                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR             AGE   CONTAINERS              IMAGES                                         SELECTOR
daemonset.apps/fluentd-elasticsearch   4         4         4       4            4           mylabel=allow_demon_set   20s   fluentd-elasticsearch   quay.io/fluentd_elasticsearch/fluentd:v2.5.2   name=fluentd-elasticsearch
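
Instead of reading the table by eye, the same check could be scripted. The sketch below compares the DaemonSet's desired pod count with the number of labeled nodes; the function names are hypothetical. It assumes no labeled node carries a taint the DaemonSet does not tolerate, since such nodes would not count towards the desired number.

```shell
#!/usr/bin/env bash
# Sketch: check that the DaemonSet targets exactly the labeled nodes.

# Count the nodes carrying the given label (e.g. mylabel=allow_demon_set).
labeled_node_count() {
  kubectl get nodes -l "$1" --no-headers | wc -l | tr -d ' '
}

# Read how many pods the DaemonSet controller wants to schedule.
daemonset_desired() {
  kubectl get daemonset "$1" -o jsonpath='{.status.desiredNumberScheduled}'
}

# Succeeds (exit status 0) when the desired count equals the labeled node count.
daemonset_matches_nodes() {
  local ds="$1" label="$2"
  [ "$(daemonset_desired "$ds")" = "$(labeled_node_count "$label")" ]
}

# Example: daemonset_matches_nodes fluentd-elasticsearch mylabel=allow_demon_set
```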

Then, before draining a node, we can remove the custom label from it. The pods should scale down gracefully, and after that we can drain the node.

$ kubectl label nodes k8s-node-3 mylabel-

Check the DaemonSet; it should have scaled down:

ubuntu@k8s-kube-client:~$ kubectl get all -o wide
NAME                              READY   STATUS        RESTARTS   AGE     IP            NODE           NOMINATED NODE   READINESS GATES
pod/fluentd-elasticsearch-jrgl6   0/1     Terminating   0          2m36s   10.244.3.19   k8s-node-3     <none>           <none>
pod/fluentd-elasticsearch-rgcm2   1/1     Running       0          2m36s   10.244.0.6    k8s-master-1   <none>           <none>
pod/fluentd-elasticsearch-wccr9   1/1     Running       0          2m36s   10.244.1.14   k8s-node-1     <none>           <none>
pod/fluentd-elasticsearch-wxq5v   1/1     Running       0          2m36s   10.244.2.33   k8s-node-2     <none>           <none>

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE   SELECTOR
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   9d    <none>

NAME                                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR             AGE     CONTAINERS              IMAGES                                         SELECTOR
daemonset.apps/fluentd-elasticsearch   3         3         3       3            3           mylabel=allow_demon_set   2m36s   fluentd-elasticsearch   quay.io/fluentd_elasticsearch/fluentd:v2.5.2   name=fluentd-elasticsearch
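
The unlabel-then-drain sequence could be wrapped in a small helper so the drain only starts once the DaemonSet pod has actually left the node. This is a rough sketch, not something tested against Kops: the function names, retry count and sleep interval are arbitrary choices, it looks for pods in the current namespace only, and it calls kubectl drain directly rather than going through a Kops rolling update.

```shell
#!/usr/bin/env bash
# Sketch: remove the selector label, wait for the pods to go, then drain.

# Count matching pods still present on a node (Terminating ones included).
pods_on_node() {
  local node="$1" selector="$2"
  kubectl get pods -l "$selector" --field-selector "spec.nodeName=$node" \
    --no-headers | wc -l | tr -d ' '
}

# Args: node name, label key, pod selector, max polls (default 30).
unlabel_and_drain() {
  local node="$1" label_key="$2" selector="$3" tries="${4:-30}"
  # A trailing "-" on the key removes the label from the node.
  kubectl label nodes "$node" "${label_key}-"
  while [ "$(pods_on_node "$node" "$selector")" -gt 0 ]; do
    tries=$((tries - 1))
    [ "$tries" -le 0 ] && { echo "timed out waiting on $node" >&2; return 1; }
    sleep 2
  done
  # Other DaemonSets (kube-proxy, CNI) may still run here, hence the flag.
  kubectl drain "$node" --ignore-daemonsets
}

# Example: unlabel_and_drain k8s-node-3 mylabel name=fluentd-elasticsearch
```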

Now add the same custom label again to the new node when it joins the cluster, and the DaemonSet will scale up:

$ kubectl label nodes k8s-node-3 mylabel=allow_demon_set

ubuntu@k8s-kube-client:~$ kubectl get all -o wide
NAME                              READY   STATUS    RESTARTS   AGE     IP            NODE           NOMINATED NODE   READINESS GATES
pod/fluentd-elasticsearch-22rsj   1/1     Running   0          2s      10.244.3.20   k8s-node-3     <none>           <none>
pod/fluentd-elasticsearch-rgcm2   1/1     Running   0          5m28s   10.244.0.6    k8s-master-1   <none>           <none>
pod/fluentd-elasticsearch-wccr9   1/1     Running   0          5m28s   10.244.1.14   k8s-node-1     <none>           <none>
pod/fluentd-elasticsearch-wxq5v   1/1     Running   0          5m28s   10.244.2.33   k8s-node-2     <none>           <none>

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE   SELECTOR
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   9d    <none>

NAME                                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR             AGE     CONTAINERS              IMAGES                                         SELECTOR
daemonset.apps/fluentd-elasticsearch   4         4         4       4            4           mylabel=allow_demon_set   5m28s   fluentd-elasticsearch   quay.io/fluentd_elasticsearch/fluentd:v2.5.2   name=fluentd-elasticsearch

Please confirm whether this is what you want to do and whether it works with kops.

answered 2020-01-22T13:17:22.150