kubernetes - Avoiding the Kubernetes scheduler running all pods on a single node of a Kubernetes cluster

Question

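I have a Kubernetes cluster with 4 nodes and one master. I am trying to run 5 nginx pods across all nodes. Currently the scheduler sometimes runs all the pods on one machine and sometimes spreads them across different machines.

What happens if a node goes down while all my pods are running on that same node? We need to avoid this.

How can I force the scheduler to place pods on the nodes in a round-robin fashion, so that if any node goes down, at least one other node still has an NGINX pod running?

Is this possible? If so, how can we achieve it?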

score 24 · Accepted Answer

Use podAntiAffinity

podAntiAffinity with requiredDuringSchedulingIgnoredDuringExecution can be used to prevent pods of the same kind from being scheduled onto the same hostname. If you prefer a looser constraint, use preferredDuringSchedulingIgnoredDuringExecution.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        podAntiAffinity:
          # Hard requirement: do not schedule an "nginx" pod onto a node that already runs one.
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: kubernetes.io/hostname   # anti-affinity scope is the host
            labelSelector:
              matchLabels:
                app: nginx
      containers:
      - name: nginx
        image: nginx:latest

Kubelet --max-pods

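You can specify the maximum number of pods per node in the kubelet configuration so that, if a node goes down, Kubernetes is prevented from saturating the other nodes with pods from the failed node.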
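
A minimal sketch of what that setting could look like in a kubelet configuration file. The `maxPods` field corresponds to the `--max-pods` flag; the value 20 is only an illustrative assumption, not a recommendation (the default is 110).

```yaml
# Kubelet configuration file (passed to the kubelet via --config).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Upper bound on the number of pods this kubelet will run.
# 20 is an illustrative value; pick one that fits the node's capacity.
maxPods: 20
```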

score 11 · Accepted Answer

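I think the inter-pod anti-affinity feature will help you. Inter-pod anti-affinity lets you constrain which nodes your pod is eligible to be scheduled on, based on the labels of pods already running on the node. Here is an example.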

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: nginx-service
  name: nginx-service
spec:
  replicas: 3
  selector:
    matchLabels:
      run: nginx-service
  template:
    metadata:
      labels:
        run: nginx-service   # must match spec.selector
        service-type: nginx
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: service-type
                  operator: In
                  values:
                  - nginx
              topologyKey: kubernetes.io/hostname
      containers:
      - name: nginx-service
        image: nginx:latest

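Note: I use preferredDuringSchedulingIgnoredDuringExecution here because you have more pods than nodes.

For more details, see the "Inter-pod affinity and anti-affinity (beta feature)" section at the following link: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/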

score 7 · Accepted Answer

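Use Pod Topology Spread Constraints

As of 2021 (v1.19 and later), Pod Topology Spread Constraints (topologySpreadConstraints) are available by default, and I find them a better fit than podAntiAffinity for this case.

The main difference is that anti-affinity can only restrict pods to one per node, whereas Pod Topology Spread Constraints can restrict them to N per node.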

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-example-deployment
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx-example
  template:
    metadata:
      labels:
        app: nginx-example
    spec:
      containers:
      - name: nginx
        image: nginx:latest
      # This controls how evenly the pods are spread.
      # For example, if there are 3 nodes available,
      # 2 pods are scheduled on each node.
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: nginx-example

For more details, see KEP-895 and the official blog post.

score 0 · Accepted Answer

We can use taints and tolerations to control whether or not pods are deployed onto a node.


Tolerations are applied to pods, and allow (but do not require) the pods to schedule onto nodes with matching taints.

Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this marks that the node should not accept any pods that do not tolerate the taints.

A sample Deployment YAML would look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: nginx-service
  name: nginx-service
spec:
  replicas: 3
  selector:
    matchLabels:
      run: nginx-service
  template:
    metadata:
      labels:
        run: nginx-service   # must match spec.selector
        service-type: nginx
    spec:
      containers:
      - name: nginx-service
        image: nginx:latest
      tolerations:
      - key: "key1"
        operator: "Equal"
        value: "value1"
        effect: "NoSchedule"
You can find more information at https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/#:~:text=Node%20affinity%2C%20is%20a%20property,onto%20nodes%20with%20matching%20taints.

score 0 · Accepted Answer

If your containers specify resource requests for the amount of memory and CPU they need, the scheduler should spread your pods out. See http://kubernetes.io/docs/user-guide/compute-resources/
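
As a sketch, resource requests are set per container in the pod template. The CPU and memory values below are illustrative assumptions and should be sized for your workload.

```yaml
# Fragment of spec.template.spec in a Deployment.
containers:
- name: nginx
  image: nginx:latest
  resources:
    requests:
      cpu: 250m      # illustrative value
      memory: 64Mi   # illustrative value
```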
