I have a fresh Kubernetes (1.10.0) cluster, installed with kubeadm (1.10.0) on RHEL7 bare-metal VMs:
Linux 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Dec 28 14:23:39 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
kubeadm.x86_64 1.10.0-0 installed
kubectl.x86_64 1.10.0-0 installed
kubelet.x86_64 1.10.0-0 installed
kubernetes-cni.x86_64 0.6.0-0 installed
and Docker 1.12:
docker-engine.x86_64 1.12.6-1.el7.centos installed
docker-engine-selinux.noarch 1.12.6-1.el7.centos installed
with the Flannel v0.9.1 pod network installed:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml
The kubeadm init command I ran was:
kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version stable-1.10
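It completed successfully, and kubeadm join on the worker node also succeeded. I can deploy a busybox pod on the master and nslookups succeed, but as soon as I deploy anything to the worker node, I see failed API calls from the worker node to the master: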
E0331 03:28:44.368253 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get https://172.30.0.85:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 172.30.0.85:6443: getsockopt: connection refused
E0331 03:28:44.368987 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get https://172.30.0.85:6443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 172.30.0.85:6443: getsockopt: connection refused
E0331 03:28:44.735886 1 event.go:209] Unable to write event: 'Post https://172.30.0.85:6443/api/v1/namespaces/default/events: dial tcp 172.30.0.85:6443: getsockopt: connection refused' (may retry after sleeping)
E0331 03:28:51.980131 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: endpoints is forbidden: User "system:serviceaccount:kube-system:kube-proxy" cannot list endpoints at the cluster scope
I0331 03:28:52.048995 1 controller_utils.go:1026] Caches are synced for service config controller
I0331 03:28:53.049005 1 controller_utils.go:1026] Caches are synced for endpoints config controller
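The log above mixes two different failures: refused connections to the API server and an RBAC "forbidden" response. A minimal sketch for tallying them separately, assuming the log text is piped in on stdin; `count_log_errors` is a hypothetical helper name, not part of kubectl:

```shell
# Count two failure modes in kube-proxy log text read from stdin.
# count_log_errors is a hypothetical helper, not a kubectl feature.
count_log_errors() {
  awk '
    /connection refused/ { refused++ }     # TCP connect to the API server failed
    /is forbidden/       { forbidden++ }   # API server reachable, request denied
    END { printf "refused=%d forbidden=%d\n", refused, forbidden }
  '
}
```

It could be fed with something like `kubectl -n kube-system logs <kube-proxy-pod> | count_log_errors`, where the pod name is a placeholder. For the four error lines shown above it would print `refused=3 forbidden=1`.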
nslookup also times out:
kubectl exec -it busybox -- nslookup kubernetes
Server: 10.96.0.10
Address 1: 10.96.0.10
nslookup: can't resolve 'kubernetes'
command terminated with exit code 1
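I've looked at many similar posts on Stack Overflow and GitHub, and all of them seem to be resolved by setting iptables -A FORWARD -j ACCEPT, but not this time. I've also included the iptables rules from the worker node: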
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- anywhere anywhere /* kubernetes service portals */
DOCKER all -- anywhere anywhere ADDRTYPE match dst-type LOCAL
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- anywhere anywhere /* kubernetes service portals */
DOCKER all -- anywhere !loopback/8 ADDRTYPE match dst-type LOCAL
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
KUBE-POSTROUTING all -- anywhere anywhere /* kubernetes postrouting rules */
MASQUERADE all -- 172.17.0.0/16 anywhere
RETURN all -- 10.244.0.0/16 10.244.0.0/16
MASQUERADE all -- 10.244.0.0/16 !base-address.mcast.net/4
RETURN all -- !10.244.0.0/16 box2.ara.ac.nz/24
MASQUERADE all -- !10.244.0.0/16 10.244.0.0/16
Chain DOCKER (2 references)
target prot opt source destination
RETURN all -- anywhere anywhere
Chain KUBE-MARK-DROP (0 references)
target prot opt source destination
MARK all -- anywhere anywhere MARK or 0x8000
Chain KUBE-MARK-MASQ (6 references)
target prot opt source destination
MARK all -- anywhere anywhere MARK or 0x4000
Chain KUBE-NODEPORTS (1 references)
target prot opt source destination
Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- anywhere anywhere /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
Chain KUBE-SEP-HZC4RESJCS322LXV (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 10.244.0.18 anywhere /* kube-system/kube-dns:dns-tcp */
DNAT tcp -- anywhere anywhere /* kube-system/kube-dns:dns-tcp */ tcp to:10.244.0.18:53
Chain KUBE-SEP-JNNVSHBUREKVBFWD (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 10.244.0.18 anywhere /* kube-system/kube-dns:dns */
DNAT udp -- anywhere anywhere /* kube-system/kube-dns:dns */ udp to:10.244.0.18:53
Chain KUBE-SEP-U3UDAUPXUG5BP2NG (2 references)
target prot opt source destination
KUBE-MARK-MASQ all -- box1.ara.ac.nz anywhere /* default/kubernetes:https */
DNAT tcp -- anywhere anywhere /* default/kubernetes:https */ recent: SET name: KUBE-SEP-U3UDAUPXUG5BP2NG side: source mask: 255.255.255.255 tcp to:172.30.0.85:6443
Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- !10.244.0.0/16 10.96.0.1 /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-SVC-NPX46M4PTMTKRN6Y tcp -- anywhere 10.96.0.1 /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-MARK-MASQ udp -- !10.244.0.0/16 10.96.0.10 /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-SVC-TCOU7JCQXEZGVUNU udp -- anywhere 10.96.0.10 /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-MARK-MASQ tcp -- !10.244.0.0/16 10.96.0.10 /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-SVC-ERIFXISQEP7F7OF4 tcp -- anywhere 10.96.0.10 /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-NODEPORTS all -- anywhere anywhere /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL
Chain KUBE-SVC-ERIFXISQEP7F7OF4 (1 references)
target prot opt source destination
KUBE-SEP-HZC4RESJCS322LXV all -- anywhere anywhere /* kube-system/kube-dns:dns-tcp */
Chain KUBE-SVC-NPX46M4PTMTKRN6Y (1 references)
target prot opt source destination
KUBE-SEP-U3UDAUPXUG5BP2NG all -- anywhere anywhere /* default/kubernetes:https */ recent: CHECK seconds: 10800 reap name: KUBE-SEP-U3UDAUPXUG5BP2NG side: source mask: 255.255.255.255
KUBE-SEP-U3UDAUPXUG5BP2NG all -- anywhere anywhere /* default/kubernetes:https */
Chain KUBE-SVC-TCOU7JCQXEZGVUNU (1 references)
target prot opt source destination
KUBE-SEP-JNNVSHBUREKVBFWD all -- anywhere anywhere /* kube-system/kube-dns:dns */
Chain WEAVE (0 references)
target prot opt source destination
Chain cali-OUTPUT (0 references)
target prot opt source destination
Chain cali-POSTROUTING (0 references)
target prot opt source destination
Chain cali-PREROUTING (0 references)
target prot opt source destination
Chain cali-fip-dnat (0 references)
target prot opt source destination
Chain cali-fip-snat (0 references)
target prot opt source destination
Chain cali-nat-outgoing (0 references)
target prot opt source destination
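The chains listed above appear to come from the nat table, whereas the FORWARD fix mentioned earlier applies to the filter table. A minimal sketch for reading the FORWARD chain's default policy, assuming the output of `iptables -S FORWARD` is piped in on stdin; `forward_policy` is a hypothetical helper name:

```shell
# Print the default policy of the FORWARD chain, given the output of
# `iptables -S FORWARD` on stdin. forward_policy is a hypothetical helper.
forward_policy() {
  # The policy line has the form: -P FORWARD ACCEPT (or DROP)
  awk '$1 == "-P" && $2 == "FORWARD" { print $3; exit }'
}
```

Run as root on the worker node, `iptables -S FORWARD | forward_policy` should print either ACCEPT or DROP, which shows whether the policy is what the other posts assume.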
I can also see packets being dropped on the flannel interface:
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.244.1.0 netmask 255.255.255.255 broadcast 0.0.0.0
inet6 fe80::a096:47ff:fe58:e438 prefixlen 64 scopeid 0x20<link>
ether a2:96:47:58:e4:38 txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 198 bytes 14747 (14.4 KiB)
TX errors 0 dropped 27 overruns 0 carrier 0 collisions 0
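To watch whether the drop counter keeps growing, the TX dropped figure can be pulled out of the ifconfig output. A minimal sketch, assuming the net-tools output format shown above; `iface_tx_dropped` is a hypothetical helper name:

```shell
# Extract the TX "dropped" counter from ifconfig output read on stdin.
# iface_tx_dropped is a hypothetical helper; it assumes the net-tools
# layout shown above ("TX errors N dropped N ...").
iface_tx_dropped() {
  awk '/TX errors/ {
    for (i = 1; i < NF; i++)
      if ($i == "dropped") { print $(i + 1); exit }
  }'
}
```

For example, `ifconfig flannel.1 | iface_tx_dropped` would print 27 for the output above; running it before and after a failing nslookup shows whether those packets are the ones being dropped.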
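I have installed the same versions of Kubernetes, Docker, and Flannel on other VMs and everything works there, so I don't know why this install produces these failed API calls from the worker node to the master. I've done several fresh installs and also tried the weave and calico pod networks, with the same result.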