I have a K8s cluster with 3 nodes (VMs), installed via kubespray (a kubeadm-based tool), with etcd running on all 3 nodes, which are untainted control-plane nodes serving as both control plane and workers.
Now I want to replace one of the VMs with another. Is there a straightforward way to do this? The only workaround I can see is to add 2 nodes (e.g. node4 and node5) via kubespray's scale.yml, so that the number of etcd members stays odd, and then remove the extra node3 and node5, keeping node4.
I don't like this approach.
Any ideas are welcome.
Regards
If you have 3 main (please avoid using master) control plane nodes, you should be fine replacing one at a time. The caveat is that while one node is down you have no failure tolerance left: if a second node fails, etcd loses quorum and the cluster can no longer make decisions or schedule new workloads, though existing workloads keep running.
The recommendation of 5 main nodes is based on the fact that you still have a majority for quorum on etcd state decisions even with one node down. So if you have 5 nodes and take one down to replace it, the cluster can survive a further failure and still schedule/run workloads.
In other words:
3 main nodes: quorum is 2, so the cluster tolerates 1 failure
5 main nodes: quorum is 3, so the cluster tolerates 2 failures
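
For example, before taking a member down you can check the member list and endpoint health with etcdctl. This is a minimal sketch; the endpoint addresses and certificate paths are assumptions based on a typical kubespray layout, so adjust them to your cluster:

```sh
# Assumed endpoints and kubespray-style cert paths; adjust for your cluster.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://node1:2379,https://node2:2379,https://node3:2379
export ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
export ETCDCTL_CERT=/etc/ssl/etcd/ssl/admin-node1.pem
export ETCDCTL_KEY=/etc/ssl/etcd/ssl/admin-node1-key.pem

# Show the current members (ID, name, peer/client URLs).
etcdctl member list -w table

# Check that every endpoint answers; with 3 members you need at least 2
# healthy for quorum, so do not start the replacement if one is already down.
etcdctl endpoint health
```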
To summarize: Raft, the consensus protocol used by etcd, tolerates up to (N-1)/2 member failures and requires a majority (quorum) of (N/2)+1 members. The recommended procedure is to replace the nodes one at a time: bring one node down, bring the new one up, and wait for it to fully join the cluster (etcd and all control plane components) before moving on to the next.
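
A rough one-at-a-time replacement flow with kubespray could look like the following. This is a sketch, not a verified procedure: the inventory path and node names are placeholders, and kubespray's docs/nodes.md for your version is the authority on the exact steps:

```sh
# 1. Remove the old node. remove-node.yml is a standard kubespray playbook
#    that drains the node and removes its etcd member; inventory path and
#    node name below are placeholders for your setup.
ansible-playbook -i inventory/mycluster/hosts.yaml -b \
  remove-node.yml -e node=node3

# 2. Add the replacement VM to the inventory (kube_control_plane, etcd and
#    kube_node groups), then re-run cluster.yml so it joins etcd and the
#    control plane.
ansible-playbook -i inventory/mycluster/hosts.yaml -b cluster.yml

# 3. Wait until the new member is healthy before touching anything else
#    (etcdctl member list / kubectl get nodes).
```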