I am following these guides to create an HA cluster with kubeadm:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/
https://medium.com/faun/configuring-ha-kubernetes-cluster-on-bare-metal-servers-with-kubeadm-1-2-1e79f0f7857b
I already have the etcd nodes up and running, the API servers behind HAProxy and keepalived, and one master node running with the weave-net network.
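For context, the load balancer in front of the API servers looks roughly like this (a minimal haproxy.cfg sketch; treat the section names and bind port as illustrative rather than my exact config):

frontend kube-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver-backend

backend kube-apiserver-backend
    mode tcp
    option tcp-check
    balance roundrobin
    server kubemaster01 192.168.129.137:6443 check
    server kubemaster02 192.168.129.138:6443 check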
I am using these subnets:
networking:
  podSubnet: 192.168.240.0/22
  serviceSubnet: 192.168.244.0/22
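(That networking block sits inside the ClusterConfiguration passed to kubeadm init; a minimal sketch, where <haproxy-vip> is a placeholder for my keepalived VIP:)

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.18.1
controlPlaneEndpoint: "<haproxy-vip>:6443"
networking:
  podSubnet: 192.168.240.0/22
  serviceSubnet: 192.168.244.0/22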
But when I join the second master node to the cluster, the weave pod created on it goes into CrashLoopBackOff.
I deployed the weave-net add-on with this one-liner:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&env.IPALLOC_RANGE=192.168.240.0/21"
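To see which range weave actually picked up, its IPAM status can be queried from inside the running weave container on the healthy master (the pod name weave-net-4zdg6 comes from the pod listing further down):

root@kubemaster01:~# kubectl exec -n kube-system weave-net-4zdg6 -c weave -- /home/weave/weave --local status ipam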
I also noticed that /etc/cni/net.d is not created by the kubelet when the weave config is applied.
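A quick way to confirm that is to list the directory on both masters; on a working node the weave container writes 10-weave.conflist there shortly after it starts:

# run on both kubemaster01 and kubemaster02
ls -l /etc/cni/net.d/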
Master nodes:
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kubemaster01 Ready master 17h v1.18.1 192.168.129.137 <none> Ubuntu 18.04.4 LTS 4.15.0-96-generic docker://19.3.8
kubemaster02 Ready master 83m v1.18.1 192.168.129.138 <none> Ubuntu 18.04.4 LTS 4.15.0-91-generic docker://19.3.8
Pods:
root@kubemaster01:~# kubectl get pods,svc --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system pod/coredns-66bff467f8-kh4mh 0/1 Running 0 18h 192.168.240.3 kubemaster01 <none> <none>
kube-system pod/coredns-66bff467f8-xhzjk 0/1 Running 0 18h 192.168.240.2 kubemaster01 <none> <none>
kube-system pod/kube-apiserver-kubemaster01 1/1 Running 0 16h 192.168.129.137 kubemaster01 <none> <none>
kube-system pod/kube-apiserver-kubemaster02 1/1 Running 0 104m 192.168.129.138 kubemaster02 <none> <none>
kube-system pod/kube-controller-manager-kubemaster01 1/1 Running 0 16h 192.168.129.137 kubemaster01 <none> <none>
kube-system pod/kube-controller-manager-kubemaster02 1/1 Running 0 104m 192.168.129.138 kubemaster02 <none> <none>
kube-system pod/kube-proxy-sct5x 1/1 Running 0 18h 192.168.129.137 kubemaster01 <none> <none>
kube-system pod/kube-proxy-tsr65 1/1 Running 0 104m 192.168.129.138 kubemaster02 <none> <none>
kube-system pod/kube-scheduler-kubemaster01 1/1 Running 2 18h 192.168.129.137 kubemaster01 <none> <none>
kube-system pod/kube-scheduler-kubemaster02 1/1 Running 0 104m 192.168.129.138 kubemaster02 <none> <none>
kube-system pod/weave-net-4zdg6 2/2 Running 0 3h 192.168.129.137 kubemaster01 <none> <none>
kube-system pod/weave-net-bf8mq 1/2 CrashLoopBackOff 38 104m 192.168.129.138 kubemaster02 <none> <none>
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 192.168.244.1 <none> 443/TCP 20h <none>
kube-system service/kube-dns ClusterIP 192.168.244.10 <none> 53/UDP,53/TCP,9153/TCP 18h k8s-app=kube-dns
IP routes on the master nodes:
root@kubemaster01:~# ip r
default via 192.168.128.1 dev ens3 proto static
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.128.0/21 dev ens3 proto kernel scope link src 192.168.129.137
192.168.240.0/21 dev weave proto kernel scope link src 192.168.240.1
root@kubemaster02:~# ip r
default via 192.168.128.1 dev ens3 proto static
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.128.0/21 dev ens3 proto kernel scope link src 192.168.129.138
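Note that the 192.168.240.0/21 route via the weave bridge exists on kubemaster01 but not on kubemaster02, consistent with the weave container never coming up there. The bridge device itself can be checked with:

# on kubemaster02; on kubemaster01 this shows the weave bridge device
ip link show weave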
Description of the weave pod running on the second master node:
root@kubemaster01:~# kubectl describe pod/weave-net-bf8mq -n kube-system
Name: weave-net-bf8mq
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: kubemaster02./192.168.129.138
Start Time: Fri, 17 Apr 2020 12:28:09 -0300
Labels: controller-revision-hash=79478b764c
name=weave-net
pod-template-generation=1
Annotations: <none>
Status: Running
IP: 192.168.129.138
IPs:
IP: 192.168.129.138
Controlled By: DaemonSet/weave-net
Containers:
weave:
Container ID: docker://93bff012aaebb34dc338001bf73798b5eeefe32a4d50b82731b0ef003c63c786
Image: docker.io/weaveworks/weave-kube:2.6.2
Image ID: docker-pullable://weaveworks/weave-kube@sha256:a1f58e75f24f02e1c2fa2a95b9e55a1b94930f455e75bd5f4799e1a55671971f
Port: <none>
Host Port: <none>
Command:
/home/weave/launch.sh
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 17 Apr 2020 14:15:59 -0300
Finished: Fri, 17 Apr 2020 14:16:29 -0300
Ready: False
Restart Count: 39
Requests:
cpu: 10m
Readiness: http-get http://127.0.0.1:6784/status delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
HOSTNAME: (v1:spec.nodeName)
IPALLOC_RANGE: 192.168.240.0/21
Mounts:
/host/etc from cni-conf (rw)
/host/home from cni-bin2 (rw)
/host/opt from cni-bin (rw)
/host/var/lib/dbus from dbus (rw)
/lib/modules from lib-modules (rw)
/run/xtables.lock from xtables-lock (rw)
/var/run/secrets/kubernetes.io/serviceaccount from weave-net-token-xp46t (ro)
/weavedb from weavedb (rw)
weave-npc:
Container ID: docker://4de9116cae90cf3f6d59279dd1531938b102adcdd1b76464e5bbe2f2b013b060
Image: docker.io/weaveworks/weave-npc:2.6.2
Image ID: docker-pullable://weaveworks/weave-npc@sha256:5694b0b77003780333ccd1fc79810469434779cd86e926a17675cc5b70470459
Port: <none>
Host Port: <none>
State: Running
Started: Fri, 17 Apr 2020 12:28:24 -0300
Ready: True
Restart Count: 0
Requests:
cpu: 10m
Environment:
HOSTNAME: (v1:spec.nodeName)
Mounts:
/run/xtables.lock from xtables-lock (rw)
/var/run/secrets/kubernetes.io/serviceaccount from weave-net-token-xp46t (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
weavedb:
Type: HostPath (bare host directory volume)
Path: /var/lib/weave
HostPathType:
cni-bin:
Type: HostPath (bare host directory volume)
Path: /opt
HostPathType:
cni-bin2:
Type: HostPath (bare host directory volume)
Path: /home
HostPathType:
cni-conf:
Type: HostPath (bare host directory volume)
Path: /etc
HostPathType:
dbus:
Type: HostPath (bare host directory volume)
Path: /var/lib/dbus
HostPathType:
lib-modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
weave-net-token-xp46t:
Type: Secret (a volume populated by a Secret)
SecretName: weave-net-token-xp46t
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoSchedule
:NoExecute
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/network-unavailable:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 11m (x17 over 81m) kubelet, kubemaster02. Container image "docker.io/weaveworks/weave-kube:2.6.2" already present on machine
Warning BackOff 85s (x330 over 81m) kubelet, kubemaster02. Back-off restarting failed container
The log files complain about timeouts, but that is because no network is running yet.
root@kubemaster02:~# kubectl logs weave-net-bf8mq -c weave -n kube-system
FATA: 2020/04/17 17:22:04.386233 [kube-peers] Could not get peers: Get https://192.168.244.1:443/api/v1/nodes: dial tcp 192.168.244.1:443: i/o timeout
Failed to get peers
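Since 192.168.244.1 is the ClusterIP of the kubernetes service and the weave pods run on the host network, the timeout can be narrowed down from the node itself (a diagnostic sketch, not output I captured):

# does kube-proxy have iptables rules for the service VIP on this node?
iptables-save | grep 192.168.244.1
# can the node reach the apiserver through the VIP at all?
curl -k --connect-timeout 5 https://192.168.244.1:443/version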
root@kubemaster02:~# kubectl logs weave-net-bf8mq -c weave-npc -n kube-system | more
INFO: 2020/04/17 15:28:24.851287 Starting Weaveworks NPC 2.6.2; node name "kubemaster02"
INFO: 2020/04/17 15:28:24.851469 Serving /metrics on :6781
Fri Apr 17 15:28:24 2020 <5> ulogd.c:408 registering plugin `NFLOG'
Fri Apr 17 15:28:24 2020 <5> ulogd.c:408 registering plugin `BASE'
Fri Apr 17 15:28:24 2020 <5> ulogd.c:408 registering plugin `PCAP'
Fri Apr 17 15:28:24 2020 <5> ulogd.c:981 building new pluginstance stack: 'log1:NFLOG,base1:BASE,pcap1:PCAP'
WARNING: scheduler configuration failed: Function not implemented
DEBU: 2020/04/17 15:28:24.887619 Got list of ipsets: []
ERROR: logging before flag.Parse: E0417 15:28:54.923915 19321 reflector.go:205] github.com/weaveworks/weave/prog/weave-npc/main.go:321: Failed to list *v1.Pod: Get https://192.168.244.1:443/api/v1/pods?limit=500&resourceVersion=0: dial tcp 192.168.244.1:443: i/o timeout
ERROR: logging before flag.Parse: E0417 15:28:54.923895 19321 reflector.go:205] github.com/weaveworks/weave/prog/weave-npc/main.go:322: Failed to list *v1.NetworkPolicy: Get https://192.168.244.1:443/apis/networking.k8s.io/v1/networkpolicies?limit=500&resourceVersion=0: dial tcp 192.168.244.1:443: i/o timeout
ERROR: logging before flag.Parse: E0417 15:28:54.924071 19321 reflector.go:205] github.com/weaveworks/weave/prog/weave-npc/main.go:320: Failed to list *v1.Namespace: Get https://192.168.244.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 192.168.244.1:443: i/o timeout
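It may also be worth ruling out the weave control/data ports between the masters (weave needs TCP 6783 plus UDP 6783-6784 between peers); a quick probe from kubemaster02, assuming netcat is installed:

root@kubemaster02:~# nc -zv 192.168.129.137 6783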
Any comments or suggestions?

Regards.