How can I resolve this issue?

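I have a manual Kubernetes setup that uses coredns as its cluster-internal DNS. A busybox pod has been deployed to run nslookup against kubernetes.default.

The lookup fails with the message nslookup: can't resolve 'kubernetes.default'. To get more insight into what happens during the lookup, I inspected the network traffic from my busybox pod with tcpdump. It shows that my pod can reach the coredns pod successfully, but the coredns pod fails to connect back: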

10:25:53.328153 IP 10.200.0.29.49598 > 10.32.0.10.domain: 2+ PTR? 10.0.32.10.in-addr.arpa. (41)
10:25:53.328393 IP 10.200.0.30.domain > 10.200.0.29.49598: 2* 1/0/0 PTR kube-dns.kube-system.svc.cluster.local. (93)
10:25:53.328410 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 49598 unreachable, length 129
10:25:58.328516 IP 10.200.0.29.50899 > 10.32.0.10.domain: 3+ PTR? 10.0.32.10.in-addr.arpa. (41)
10:25:58.328738 IP 10.200.0.30.domain > 10.200.0.29.50899: 3* 1/0/0 PTR kube-dns.kube-system.svc.cluster.local. (93)
10:25:58.328752 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 50899 unreachable, length 129
10:25:58.343205 ARP, Request who-has 10.200.0.1 tell 10.200.0.29, length 28
10:25:58.343217 ARP, Reply 10.200.0.1 is-at 0a:58:0a:c8:00:01 (oui Unknown), length 28
10:25:58.351250 ARP, Request who-has 10.200.0.29 tell 10.200.0.30, length 28
10:25:58.351250 ARP, Request who-has 10.200.0.30 tell 10.200.0.29, length 28
10:25:58.351261 ARP, Reply 10.200.0.29 is-at 0a:58:0a:c8:00:1d (oui Unknown), length 28
10:25:58.351262 ARP, Reply 10.200.0.30 is-at 0a:58:0a:c8:00:1e (oui Unknown), length 28
10:26:03.331409 IP 10.200.0.29.45823 > 10.32.0.10.domain: 4+ PTR? 10.0.32.10.in-addr.arpa. (41)
10:26:03.331618 IP 10.200.0.30.domain > 10.200.0.29.45823: 4* 1/0/0 PTR kube-dns.kube-system.svc.cluster.local. (93)
10:26:03.331631 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 45823 unreachable, length 129
10:26:08.348259 IP 10.200.0.29.43332 > 10.32.0.10.domain: 5+ PTR? 10.0.32.10.in-addr.arpa. (41)
10:26:08.348492 IP 10.200.0.30.domain > 10.200.0.29.43332: 5* 1/0/0 PTR kube-dns.kube-system.svc.cluster.local. (93)
10:26:08.348506 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 43332 unreachable, length 129
10:26:13.353491 IP 10.200.0.29.55715 > 10.32.0.10.domain: 6+ AAAA? kubernetes.default. (36)
10:26:13.354955 IP 10.200.0.30.domain > 10.200.0.29.55715: 6 NXDomain* 0/0/0 (36)
10:26:13.354971 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 55715 unreachable, length 72
10:26:18.354285 IP 10.200.0.29.57421 > 10.32.0.10.domain: 7+ AAAA? kubernetes.default. (36)
10:26:18.355533 IP 10.200.0.30.domain > 10.200.0.29.57421: 7 NXDomain* 0/0/0 (36)
10:26:18.355550 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 57421 unreachable, length 72
10:26:23.359405 IP 10.200.0.29.44332 > 10.32.0.10.domain: 8+ AAAA? kubernetes.default. (36)
10:26:23.361155 IP 10.200.0.30.domain > 10.200.0.29.44332: 8 NXDomain* 0/0/0 (36)
10:26:23.361171 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 44332 unreachable, length 72
10:26:23.367220 ARP, Request who-has 10.200.0.30 tell 10.200.0.29, length 28
10:26:23.367232 ARP, Reply 10.200.0.30 is-at 0a:58:0a:c8:00:1e (oui Unknown), length 28
10:26:23.370352 ARP, Request who-has 10.200.0.1 tell 10.200.0.29, length 28
10:26:23.370363 ARP, Reply 10.200.0.1 is-at 0a:58:0a:c8:00:01 (oui Unknown), length 28
10:26:28.367698 IP 10.200.0.29.48446 > 10.32.0.10.domain: 9+ AAAA? kubernetes.default. (36)
10:26:28.369133 IP 10.200.0.30.domain > 10.200.0.29.48446: 9 NXDomain* 0/0/0 (36)
10:26:28.369149 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 48446 unreachable, length 72
10:26:33.381266 IP 10.200.0.29.50714 > 10.32.0.10.domain: 10+ A? kubernetes.default. (36)
10:26:33.382745 IP 10.200.0.30.domain > 10.200.0.29.50714: 10 NXDomain* 0/0/0 (36)
10:26:33.382762 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 50714 unreachable, length 72
10:26:38.386288 IP 10.200.0.29.39198 > 10.32.0.10.domain: 11+ A? kubernetes.default. (36)
10:26:38.388635 IP 10.200.0.30.domain > 10.200.0.29.39198: 11 NXDomain* 0/0/0 (36)
10:26:38.388658 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 39198 unreachable, length 72
10:26:38.395241 ARP, Request who-has 10.200.0.29 tell 10.200.0.30, length 28
10:26:38.395248 ARP, Reply 10.200.0.29 is-at 0a:58:0a:c8:00:1d (oui Unknown), length 28
10:26:43.389355 IP 10.200.0.29.46495 > 10.32.0.10.domain: 12+ A? kubernetes.default. (36)
10:26:43.391522 IP 10.200.0.30.domain > 10.200.0.29.46495: 12 NXDomain* 0/0/0 (36)
10:26:43.391539 IP 10.200.0.2
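
To make the pattern in the capture easier to read, here is a small sketch that pairs the address each DNS query was sent to with the address the next DNS reply came from. It assumes the tcpdump text output has been saved to a file; the function name `reply_sources` and the file name in the usage comment are made up for illustration.

```shell
# reply_sources FILE
# Reads tcpdump text output and prints one line per distinct
# "queried-address reply-source-address" pair for DNS traffic.
reply_sources() {
    awk '
        # Query line: "<time> IP <client>.<port> > <server>.domain: ..."
        $2 == "IP" && $4 == ">" && $5 ~ /\.domain:$/ {
            dst = $5
            sub(/\.domain:$/, "", dst)   # keep only the queried address
            next
        }
        # Reply line: "<time> IP <server>.domain > <client>.<port>: ..."
        $2 == "IP" && $3 ~ /\.domain$/ && dst != "" {
            src = $3
            sub(/\.domain$/, "", src)    # keep only the replying address
            print dst, src
            dst = ""
        }
    ' "$1" | sort -u
}

# Usage (capture.txt is a hypothetical file holding the output above):
#   reply_sources capture.txt
```

For the lines shown above, each query goes to 10.32.0.10 and the reply that follows comes from 10.200.0.30.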

Cluster infrastructure

NAMESPACE     NAME             DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
default       deploy/busybox   1         1         1            1           1h
kube-system   deploy/coredns   1         1         1            1           17h

NAMESPACE     NAME                    DESIRED   CURRENT   READY     AGE
default       rs/busybox-56db8bd9d7   1         1         1         1h
kube-system   rs/coredns-b8d4b46c8    1         1         1         17h

NAMESPACE     NAME             DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
default       deploy/busybox   1         1         1            1           1h
kube-system   deploy/coredns   1         1         1            1           17h

NAMESPACE     NAME                    DESIRED   CURRENT   READY     AGE
default       rs/busybox-56db8bd9d7   1         1         1         1h
kube-system   rs/coredns-b8d4b46c8    1         1         1         17h

NAMESPACE     NAME                          READY     STATUS    RESTARTS   AGE
default       po/busybox-56db8bd9d7-fv7np   1/1       Running   2          1h
kube-system   po/coredns-b8d4b46c8-6tg5d    1/1       Running   2          17h

NAMESPACE     NAME             TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       svc/kubernetes   ClusterIP   10.32.0.1    <none>        443/TCP                  22h
kube-system   svc/kube-dns     ClusterIP   10.32.0.10   <none>        53/UDP,53/TCP,9153/TCP   17h

Busybox IP

kubectl describe pod busybox-56db8bd9d7-fv7np | grep IP
IP:             10.200.0.29

Endpoints, to check the DNS IP and port

kubectl get endpoints --all-namespaces
NAMESPACE     NAME                      ENDPOINTS                                        AGE
default       kubernetes                192.168.0.218:6443                               22h
kube-system   kube-controller-manager   <none>                                           22h
kube-system   kube-dns                  10.200.0.30:9153,10.200.0.30:53,10.200.0.30:53   2h
kube-system   kube-scheduler            <none>                                           22h

1 Answer

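Debugging this takes a few steps to make sure you have all the ground covered.

First, start a pod (busybox or anything else) that has a few tools in it, such as host, dig, and nslookup.

Next, identify the pod IP of coredns. With that, run host kubernetes.default.svc.cluster.local <podIP>. If that does not work, something is wrong with pod-to-pod connectivity in your cluster.

If it does work, try host kubernetes.default.svc.cluster.local <service IP>, using the service IP of your DNS service. If that does not work, it looks like kube-proxy is not working properly, or something is going wrong at the iptables level.

If it works, look at /etc/resolv.conf in the pod and at the value of the kubelet --cluster-dns flag.

Side note: all of the above assumes that your coredns itself is working properly in the first place.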
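
The steps above can be sketched as a couple of shell helpers. This is only a rough outline under some assumptions: the debug pod is called busybox, nslookup is used in place of host because busybox ships it, and the tool's exit status is taken to reflect whether the lookup succeeded, which may not hold for every build. The helper names `lookup_via` and `diagnose` are made up for illustration.

```shell
# Assumed names; adjust to your cluster.
DEBUG_POD=busybox
NAME=kubernetes.default.svc.cluster.local

# lookup_via SERVER
# Resolves $NAME from inside the debug pod against SERVER.
# Returns kubectl's exit status, which passes through the tool's status.
lookup_via() {
    kubectl exec "$DEBUG_POD" -- nslookup "$NAME" "$1" >/dev/null 2>&1
}

# diagnose POD_RESULT SVC_RESULT
# Each argument is "ok" or "fail". Prints where to look next,
# following the order of the steps in this answer.
diagnose() {
    if [ "$1" != ok ]; then
        echo "pod-to-pod connectivity"
    elif [ "$2" != ok ]; then
        echo "kube-proxy or iptables"
    else
        echo "/etc/resolv.conf and kubelet --cluster-dns"
    fi
}

# Usage with the addresses from the question
# (10.200.0.30 is the coredns pod IP, 10.32.0.10 is the kube-dns service IP):
#   if lookup_via 10.200.0.30; then pod=ok; else pod=fail; fi
#   if lookup_via 10.32.0.10;  then svc=ok; else svc=fail; fi
#   diagnose "$pod" "$svc"
```

Keeping the decision in `diagnose` separate from the lookups means the lookup command can be swapped for host or dig without touching the rest.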

answered 2017-11-30T10:54:30.610