docker - Kubernetes Multus: no macvlan connectivity between pods on different nodes (ping fails)

Question

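I have a Kubernetes cluster with two worker nodes and one master node; let's call them W1, W2 and M. A Deployment creates a set of CentOS 7 pods, a few on each worker. I use Multus so that every pod gets an additional net1 interface that maps to eth1 on the worker. On all pods, net1 is attached to the same macvlan network, named "up-net".

On both W1 and W2, pods running on the same node can ping each other, but a pod on W1 cannot ping a pod on W2, or vice versa. Pinging over the standard Kubernetes network on eth0 works in all cases. Only the macvlan network has this problem.

That is the problem in short. Now let me describe the setup we are using in more detail.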

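We have a lab with three physical servers on which we deployed Kolla (OpenStack installed on Kubernetes). Inside this OpenStack installation I am setting up a second Kubernetes installation, whose master and worker nodes (W1, W2 and M) are hosted in OpenStack VMs. That makes three layers of virtualization in total. I mention it in case it gives anyone a lead, although I have not run into any problem that I believe is related to the virtualization. The VMs have two interfaces, eth0 and eth1; eth1 is the device I want the macvlan on. Finally, both the VMs and the physical servers run CentOS 7.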

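About the Kubernetes installation:

Kubernetes (the overcloud) was installed with Kubespray.
I edited the hosts file so that node1 is the master, node2 is W1 and node3 is W2.
I set kube_network_plugin_multus to true.
Whereabouts is used to assign IP addresses to the net1 interfaces.
I use Calico as the network plugin.

Here is the configuration used for the macvlan network: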

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: up-net
spec:
  config: '{
      "cniVersion": "0.3.0",
      "name": "up-net",
      "type": "macvlan",
      "master": "eth1",
      "mode": "bridge",
      "ipam": {
        "type": "whereabouts",
        "datastore": "kubernetes",
        "kubernetes": { "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig" },
        "range": "192.168.3.225/28",
        "log_file" : "/tmp/whereabouts.log",
        "log_level" : "debug"
      }
    }'

Here is the pod configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
    name: sample
    labels:
        app: centos-host
spec:
    replicas: 4
    selector:
        matchLabels:
            app: centos-host
    template:
        metadata:
            labels:
                app: centos-host
            annotations:
                k8s.v1.cni.cncf.io/networks: up-net
        spec:
            containers:
              - name: centos-container
                image: centos:7
                command: ["/bin/sleep", "infinity"]

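I have not explicitly specified which worker the pods end up on, but the scheduler usually spreads the four pods evenly.

In addition, here are the kube-system pods: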

[centos@node1 ~]$ kubectl get pods -n kube-system
NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-8b5ff5d58-msq2m   1/1     Running   1          29h
calico-node-2kg2l                         1/1     Running   1          29h
calico-node-4fxwr                         1/1     Running   1          29h
calico-node-m4l67                         1/1     Running   1          29h
coredns-85967d65-6ksqx                    1/1     Running   1          29h
coredns-85967d65-8nbgq                    1/1     Running   1          29h
dns-autoscaler-5b7b5c9b6f-567vz           1/1     Running   1          29h
kube-apiserver-node1                      1/1     Running   1          29h
kube-controller-manager-node1             1/1     Running   1          29h
kube-multus-ds-amd64-dzmj5                1/1     Running   1          29h
kube-multus-ds-amd64-mvfpc                1/1     Running   1          29h
kube-multus-ds-amd64-sbw8n                1/1     Running   1          29h
kube-proxy-6jgvn                          1/1     Running   1          29h
kube-proxy-tzf5t                          1/1     Running   1          29h
kube-proxy-vgmh8                          1/1     Running   1          29h
kube-scheduler-node1                      1/1     Running   1          29h
nginx-proxy-node2                         1/1     Running   1          29h
nginx-proxy-node3                         1/1     Running   1          29h
nodelocaldns-27bct                        1/1     Running   1          29h
nodelocaldns-75cgg                        1/1     Running   1          29h
nodelocaldns-ftvn9                        1/1     Running   1          29h
whereabouts-4tktv                         1/1     Running   0          28h
whereabouts-nfwkz                         1/1     Running   0          28h
whereabouts-vxgwr                         1/1     Running   0          28h

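Now that the setup is explained, here are the experiments I ran.

Consider pods P1a and P1b on worker 1 (W1), and P2a and P2b on worker 2 (W2). I use ping and tcpdump to assess connectivity.

Pinging from P1a to P1b works fine, and tcpdump shows ICMP traffic on W1's eth1 device. The same holds on W2.

However, when I ping P2a from P1a, it looks like this: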

[root@sample-7b9755db48-gxq5m /]# ping -c 2 192.168.3.228
PING 192.168.3.228 (192.168.3.228) 56(84) bytes of data.
From 192.168.3.227 icmp_seq=1 Destination Host Unreachable
From 192.168.3.227 icmp_seq=2 Destination Host Unreachable

--- 192.168.3.228 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1000ms
pipe 2

One interesting clue, though, is that in this case the ICMP packets end up on the pod's lo interface:

[root@sample-7b9755db48-gxq5m /]# tcpdump -vnes0 -i lo
tcpdump: listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
12:51:57.261003 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 126: (tos 0xc0, ttl 64, id 32401, offset 0, flags [none], proto ICMP (1), length 112)
    192.168.3.227 > 192.168.3.227: ICMP host 192.168.3.228 unreachable, length 92
        (tos 0x0, ttl 64, id 39033, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.3.227 > 192.168.3.228: ICMP echo request, id 137, seq 1, length 64
12:51:57.261019 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 126: (tos 0xc0, ttl 64, id 32402, offset 0, flags [none], proto ICMP (1), length 112)
    192.168.3.227 > 192.168.3.227: ICMP host 192.168.3.228 unreachable, length 92
        (tos 0x0, ttl 64, id 39375, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.3.227 > 192.168.3.228: ICMP echo request, id 137, seq 2, length 64
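
A "Destination Host Unreachable" reported from the pod's own address (192.168.3.227) usually means the sender never received an ARP reply for the target, so the kernel generated the error locally; that would be consistent with the packets showing up on lo. A minimal sketch for checking this, assuming iproute2 is available in the pod (`neigh_state` is a hypothetical helper, not part of any tool):

```shell
#!/bin/sh
# neigh_state IP: read `ip neigh show` output on stdin and print the
# neighbour state (the last field) recorded for IP, e.g. REACHABLE or FAILED.
neigh_state() {
    awk -v ip="$1" '$1 == ip { print $NF }'
}

# Inside the pod, right after a failed ping:
#   ip neigh show dev net1 | neigh_state 192.168.3.228
# FAILED or INCOMPLETE would mean that no ARP reply came back on net1.
# On the other worker, `tcpdump -ni eth1 arp` shows whether the ARP
# request reached that node at all.
```

If the ARP request never appears on the other worker's eth1, the frames are probably being dropped somewhere between the two VMs rather than inside the pods.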

Do you think there might be something wrong with my routing table? I cannot see anything, but I am fairly new to networking:

[root@sample-7b9755db48-gxq5m /]# ip route
default via 169.254.1.1 dev eth0 
169.254.1.1 dev eth0 scope link 
192.168.3.224/28 dev net1 proto kernel scope link src 192.168.3.227
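
As a sanity check on the routing table, the sketch below verifies that the source and target addresses from the ping above fall into the same /28, which means the target is on-link via net1 and is resolved by ARP rather than sent to a gateway. The helper names `ip_to_int` and `same_subnet` are made up for this illustration.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    (
        # Split the address on dots inside a subshell so IFS stays local.
        IFS=.
        set -- $1
        echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    )
}

# same_subnet IP1 IP2 PREFIX: succeed when both addresses share the
# same network for the given prefix length.
same_subnet() {
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

if same_subnet 192.168.3.227 192.168.3.228 28; then
    echo "same subnet: on-link via net1, resolved by ARP"
else
    echo "different subnets: traffic goes to a gateway"
fi
```

Both addresses mask to 192.168.3.224, matching the kernel route on net1, so the routing table itself looks consistent with the configured range.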

Finally, a list of things I have tried that did not work:

Set eth1 to promiscuous mode on W1, W2 and M.
Disabled rp_filter for IPv4 (because I read that macvlan does odd things with MAC addresses).

score 0 · Accepted Answer

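To sum up, I managed to find the answer myself. It turned out that OpenStack security groups were causing the problem. The only change needed was to disable port security on all ports of the eth1 network. This is the command I used for each such port: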

openstack port set --no-security-group --disable-port-security <id or name of the neutron port>

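After that, the machines could reach each other. No restart of servers or services was needed.

I do find it a bit odd that this problem only occurred on the secondary network. Either way, I hope this helps others who are trying to run Kubernetes in OpenStack VMs.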
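
With several ports on the network, the per-port command can be wrapped in a loop. The following is only a sketch: `disable_port_security` is a hypothetical helper, `my-eth1-network` is a placeholder for the Neutron network behind eth1, and the loop touches every port on that network (including any DHCP or router ports), so review the port list before running it.

```shell
#!/bin/sh
# disable_port_security NETWORK: remove security groups and disable port
# security on every Neutron port attached to NETWORK (name or ID).
disable_port_security() {
    # Print one port ID per line for the given network.
    openstack port list --network "$1" -f value -c ID |
    while read -r port_id; do
        [ -n "$port_id" ] || continue
        echo "disabling port security on port $port_id"
        # stdin is redirected so the command cannot consume the ID list.
        openstack port set --no-security-group --disable-port-security "$port_id" < /dev/null
    done
}

# Example call (placeholder network name, commented out on purpose):
# disable_port_security my-eth1-network
```

Disabling port security removes Neutron's anti-spoofing filtering on those ports, so this is best limited to networks where that is acceptable.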
