I am trying to deploy my sample Spring Boot microservice to a Kubernetes cluster. All of my nodes show the Ready status, but when I deploy, my pod stays in ContainerCreating.

When I describe the pod, it reports that networkPlugin cni failed to set up the pod network: unable to allocate IP address.

The output of my pod describe command is as follows:

Events:
  Type     Reason                  Age                    From                   Message
  ----     ------                  ----                   ----                   -------
  Normal   Scheduled               <unknown>              default-scheduler      Successfully assigned default/spacestudysecurityauthcontrol-deployment-57596f4795-jxxvj to mildevkub040
  Warning  FailedCreatePodSandBox  53m                    kubelet, mildevkub040  Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "2499f91b4a1173fb854a47ba1910d1fc3f18cfb35bf5c38c9a3008e19d385e15" network for pod "spacestudysecurityauthcontrol-deployment-57596f4795-jxxvj": networkPlugin cni failed to set up pod "spacestudysecurityauthcontrol-deployment-57596f4795-jxxvj_default" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/2499f91b4a1173fb854a47ba1910d1fc3f18cfb35bf5c38c9a3008e19d385e15: dial tcp 127.0.0.1:6784: connect: connection refused, failed to clean up sandbox container "2499f91b4a1173fb854a47ba1910d1fc3f18cfb35bf5c38c9a3008e19d385e15" network for pod "spacestudysecurityauthcontrol-deployment-57596f4795-jxxvj": networkPlugin cni failed to teardown pod "spacestudysecurityauthcontrol-deployment-57596f4795-jxxvj_default" network: Delete http://127.0.0.1:6784/ip/2499f91b4a1173fb854a47ba1910d1fc3f18cfb35bf5c38c9a3008e19d385e15: dial tcp 127.0.0.1:6784: connect: connection refused]
  Normal   SandboxChanged          3m40s (x228 over 53m)  kubelet, mildevkub040  Pod sandbox changed, it will be killed and re-created.

When I check the logs of the weave container, I get the following:

INFO: 2020/01/09 12:18:12.061328 ->[192.168.16.178:42838] connection shutting down due to error during handshake: write tcp 192.168.16.177:6783->192.168.16.178:42838: write: connection reset by peer
INFO: 2020/01/09 12:18:18.998360 ->[192.168.16.178:37570] connection accepted
INFO: 2020/01/09 12:18:20.653339 ->[192.168.16.178:45223] connection shutting down due to error during handshake: write tcp 192.168.16.177:6783->192.168.16.178:45223: write: connection reset by peer
INFO: 2020/01/09 12:18:21.122204 overlay_switch ->[56:60:12:a9:76:d1(mildevkub050)] using fastdp
INFO: 2020/01/09 12:18:21.742168 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection deleted
INFO: 2020/01/09 12:18:21.800670 ->[192.168.16.178:6783] attempting connection
INFO: 2020/01/09 12:18:22.470207 ->[192.168.16.175:59923] connection accepted
INFO: 2020/01/09 12:18:22.912690 ->[192.168.16.175:6783|be:b1:3f:a4:34:88(mildevkub020)]: connection deleted
INFO: 2020/01/09 12:18:22.918075 Removed unreachable peer be:b1:3f:a4:34:88(mildevkub020)
INFO: 2020/01/09 12:18:22.918144 Removed unreachable peer 56:60:12:a9:76:d1(mildevkub050)
INFO: 2020/01/09 12:18:24.602093 ->[192.168.16.175:6783] attempting connection
INFO: 2020/01/09 12:18:26.782123 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection ready; using protocol version 2
INFO: 2020/01/09 12:18:27.918518 ->[192.168.16.175:59923|be:b1:3f:a4:34:88(mildevkub020)]: connection ready; using protocol version 2
INFO: 2020/01/09 12:18:29.365629 ->[192.168.16.178:37570|56:60:12:a9:76:d1(mildevkub050)]: connection ready; using protocol version 2
INFO: 2020/01/09 12:18:29.864370 overlay_switch ->[56:60:12:a9:76:d1(mildevkub050)] using fastdp
INFO: 2020/01/09 12:18:30.086645 overlay_switch ->[56:60:12:a9:76:d1(mildevkub050)] using fastdp
INFO: 2020/01/09 12:18:30.090275 overlay_switch ->[be:b1:3f:a4:34:88(mildevkub020)] using fastdp
INFO: 2020/01/09 12:18:30.100874 ->[192.168.16.178:37570|56:60:12:a9:76:d1(mildevkub050)]: connection added (new peer)
INFO: 2020/01/09 12:18:30.104237 ->[192.168.16.178:37570|56:60:12:a9:76:d1(mildevkub050)]: connection deleted
INFO: 2020/01/09 12:18:30.104284 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection added (new peer)
INFO: 2020/01/09 12:18:30.104371 ->[192.168.16.175:59923|be:b1:3f:a4:34:88(mildevkub020)]: connection added (new peer)
INFO: 2020/01/09 12:18:30.776275 ->[192.168.16.178:37570|56:60:12:a9:76:d1(mildevkub050)]: connection shutting down due to error: Multiple connections to 56:60:12:a9:76:d1(mildevkub050) added to 5a:67:92:b3:58:ce(mildevkub040)
INFO: 2020/01/09 12:18:44.305079 ->[192.168.16.175:6783|be:b1:3f:a4:34:88(mildevkub020)]: connection ready; using protocol version 2
INFO: 2020/01/09 12:18:45.200565 overlay_switch ->[be:b1:3f:a4:34:88(mildevkub020)] using fastdp
INFO: 2020/01/09 12:18:45.458203 ->[192.168.16.175:59923|be:b1:3f:a4:34:88(mildevkub020)]: connection fully established
INFO: 2020/01/09 12:18:45.461157 ->[192.168.16.175:6783|be:b1:3f:a4:34:88(mildevkub020)]: connection shutting down due to error: Multiple connections to be:b1:3f:a4:34:88(mildevkub020) added to 5a:67:92:b3:58:ce(mildevkub040)
INFO: 2020/01/09 12:18:45.470667 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection fully established
INFO: 2020/01/09 12:18:45.688871 sleeve ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: Effective MTU verified at 1438
INFO: 2020/01/09 12:18:45.874380 sleeve ->[192.168.16.175:6783|be:b1:3f:a4:34:88(mildevkub020)]: Effective MTU verified at 1438
INFO: 2020/01/09 12:24:12.026645 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection shutting down due to error: write tcp 192.168.16.177:38313->192.168.16.178:6783: write: connection reset by peer
INFO: 2020/01/09 12:25:56.708405 ->[192.168.16.178:44120] connection accepted
INFO: 2020/01/09 12:26:31.769826 overlay_switch ->[56:60:12:a9:76:d1(mildevkub050)] sleeve timed out waiting for UDP heartbeat
INFO: 2020/01/09 12:26:41.819554 ->[192.168.16.175:59923|be:b1:3f:a4:34:88(mildevkub020)]: connection shutting down due to error: write tcp 192.168.16.177:6783->192.168.16.175:59923: write: connection reset by peer
INFO: 2020/01/09 12:28:17.563133 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection deleted
INFO: 2020/01/09 12:30:49.548347 ->[192.168.16.178:60937] connection accepted

When I run the command kubectl exec -n kube-system weave-net-fj9mm -c weave -- /home/weave/weave --local status ipam, I get a response like: Error from server (NotFound): pods "weave-net-fj9mm" not found

How can I resolve this issue?

1 Answer

If you curl the Weave endpoint that appears in the pod describe output (127.0.0.1:6784), you will get something like this.

# curl 'http://127.0.0.1:6784/status'
        Version: 1.8.2 (version 1.9.1 available - please upgrade!)

        Service: router
       Protocol: weave 1..2
           Name: 66:2b:6a:ca:34:88(ip-10-128-152-185)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 4
    Connections: 4 (3 established, 1 failed)
          Peers: 4 (with 12 established connections)
 TrustedSubnets: none

        Service: ipam
         Status: waiting for IP range grant from peers
          Range: 10.32.0.0/12
  DefaultSubnet: 10.32.0.0/12

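The "waiting for IP range grant from peers" status means that Weave Net's IPAM believes the entire IP address space is owned by other nodes in the cluster, but none of those nodes can currently be reached.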
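
To check for this state from a script, the IPAM status line can be pulled out of the status text. The following is a minimal sketch and not part of Weave itself: the function name ipam_status is hypothetical, and it assumes the status text has the layout shown above (a "Service: ipam" line followed by a "Status:" line).

```shell
#!/bin/sh
# Hypothetical helper: read Weave status text on stdin and print the
# "Status:" value that belongs to the "Service: ipam" section.
ipam_status() {
    awk '
        # Entering the ipam section.
        /^[ \t]*Service:[ \t]*ipam/ { in_ipam = 1; next }
        # Any other "Service:" line starts a different section.
        /^[ \t]*Service:/           { in_ipam = 0 }
        # First Status line inside the ipam section: strip the label and print.
        in_ipam && /^[ \t]*Status:/ {
            sub(/^[ \t]*Status:[ \t]*/, "")
            print
            exit
        }
    '
}

# Usage on the affected node (assumes the Weave HTTP API listens on
# 127.0.0.1:6784, as in the output above):
#   curl -s 'http://127.0.0.1:6784/status' | ipam_status
```

If this prints "waiting for IP range grant from peers", the node is in the situation described above.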

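Here is the workaround. Big red warnings:

  • First make sure that all unreachable hosts are gone for good.
  • Do not run this on more than one node.
  • If something goes wrong, this could destroy your Kubernetes cluster.
  • In case you did not read the warnings above, a fail-safe "echo" has been added to the command; remove it to actually send the DELETE requests.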
% for i in $(curl -s 'http://127.0.0.1:6784/status/ipam' | grep 'unreachable\!$' | sort -k2 -n -r | awk -F'(' '{print $2}' | sed 's/).*//'); do echo curl -X DELETE 127.0.0.1:6784/peer/$i; done
65536 IPs taken over from ip-10-128-184-15
32768 IPs taken over from ip-10-128-159-154
32768 IPs taken over from ip-10-128-170-84
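
The one-liner above can be split into two steps so that the list of peers can be reviewed before anything is deleted. This is only a sketch under assumptions: the function names unreachable_peers and print_delete_commands are hypothetical, and the input is assumed to consist of lines of the form "mac(nickname)  N IPs (...) - unreachable!", which is what the grep/awk/sed pipeline above implies. It only prints the curl commands and never sends them.

```shell
#!/bin/sh
# Hypothetical helper: read `status ipam` text on stdin and print the
# nickname of every peer flagged unreachable, largest allocation first.
unreachable_peers() {
    grep 'unreachable!$' \
        | sort -k2 -n -r \
        | awk -F'(' '{print $2}' \
        | sed 's/).*//'
}

# Hypothetical helper: turn peer nicknames on stdin into the DELETE
# commands from the answer. Dry run only: commands are echoed, not executed.
print_delete_commands() {
    while IFS= read -r peer; do
        if [ -n "$peer" ]; then
            echo "curl -X DELETE 127.0.0.1:6784/peer/$peer"
        fi
    done
}

# Usage on ONE node only, after confirming the hosts are gone for good:
#   curl -s 'http://127.0.0.1:6784/status/ipam' | unreachable_peers | print_delete_commands
```

Review the printed commands, then run them by hand if every listed peer is permanently gone.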

Reference - https://github.com/weaveworks/weave/issues/2822

Answered 2020-02-15T21:58:14.727