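Installation process
I am new to Kubernetes and have set up a Kubernetes cluster on Azure VMs. I want to deploy Windows containers, and to do that I need to add Windows worker nodes. I have already deployed a kubeadm cluster with 3 master nodes and one Linux worker node, and those nodes work fine.
As soon as I add the Windows node, everything goes downhill. I use Flannel as my CNI plugin and prepared the DaemonSets and control plane according to the Kubernetes documentation: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/adding-windows-nodes/
After installing the Flannel DaemonSet, I installed the proxy and Docker EE accordingly.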
Software used
Master nodes
OS: Ubuntu 18.04 LTS
Container runtime: Docker 20.10.5
Kubernetes version: 1.21.0
Flannel image version: 0.14.0
Kube-proxy version: 1.21.0
Windows worker node
OS: Windows Server 2019 Datacenter Core
Container runtime: Docker 20.10.4
Kubernetes version: 1.21.0
Flannel image version: 0.13.0-nanoserver
Kube-proxy version: 1.21.0-nanoserver
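Desired result:
I would like to see a complete cluster that is ready for use, with everything it needs in the Running state.
Current result:
After the installation I checked whether it had succeeded: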
azureuser@Kube-M-001:~$ kubectl get pods -o wide -n kube-system
NAME                                 READY  STATUS            RESTARTS  AGE   IP          NODE        NOMINATED NODE  READINESS GATES
coredns-558bd4d5db-8mshg             1/1    Running           0         178m  10.244.0.3  kube-m-001  <none>          <none>
coredns-558bd4d5db-xhsmn             1/1    Running           0         178m  10.244.0.2  kube-m-001  <none>          <none>
etcd-kube-m-001                      1/1    Running           0         178m  10.0.10.4   kube-m-001  <none>          <none>
etcd-kube-m-002                      1/1    Running           0         164m  10.0.10.5   kube-m-002  <none>          <none>
etcd-kube-m-003                      1/1    Running           0         162m  10.0.10.6   kube-m-003  <none>          <none>
kube-apiserver-kube-m-001            1/1    Running           0         178m  10.0.10.4   kube-m-001  <none>          <none>
kube-apiserver-kube-m-002            1/1    Running           1         165m  10.0.10.5   kube-m-002  <none>          <none>
kube-apiserver-kube-m-003            1/1    Running           0         162m  10.0.10.6   kube-m-003  <none>          <none>
kube-controller-manager-kube-m-001   1/1    Running           1         178m  10.0.10.4   kube-m-001  <none>          <none>
kube-controller-manager-kube-m-002   1/1    Running           0         165m  10.0.10.5   kube-m-002  <none>          <none>
kube-controller-manager-kube-m-003   1/1    Running           0         163m  10.0.10.6   kube-m-003  <none>          <none>
kube-flannel-ds-5lwzf                1/1    Running           0         165m  10.0.10.5   kube-m-002  <none>          <none>
kube-flannel-ds-6lvgp                1/1    Running           0         129m  10.0.10.7   kube-w-001  <none>          <none>
kube-flannel-ds-dlmkt                1/1    Running           0         163m  10.0.10.6   kube-m-003  <none>          <none>
kube-flannel-ds-h27r7                1/1    Running           0         169m  10.0.10.4   kube-m-001  <none>          <none>
kube-flannel-ds-windows-amd64-hwbjc  1/1    Running           0         121m  10.0.64.4   kube-w-002  <none>          <none>
kube-proxy-4rkgk                     1/1    Running           0         178m  10.0.10.4   kube-m-001  <none>          <none>
kube-proxy-6g4sb                     1/1    Running           0         129m  10.0.10.7   kube-w-001  <none>          <none>
kube-proxy-tvm9g                     1/1    Running           0         165m  10.0.10.5   kube-m-002  <none>          <none>
kube-proxy-windows-j7c27             0/1    CrashLoopBackOff  26        121m  10.244.4.2  kube-w-002  <none>          <none>
kube-proxy-wzjm7                     1/1    Running           0         163m  10.0.10.6   kube-m-003  <none>          <none>
kube-scheduler-kube-m-001            1/1    Running           1         178m  10.0.10.4   kube-m-001  <none>          <none>
kube-scheduler-kube-m-002            1/1    Running           0         165m  10.0.10.5   kube-m-002  <none>          <none>
kube-scheduler-kube-m-003            1/1    Running           0         162m  10.0.10.6   kube-m-003  <none>          <none>
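To single out the failing pods in a listing like the one above, a small filter along these lines could be used. This is only a sketch: the function name `not_running` is made up for illustration, and it assumes STATUS is the third column, as it is in the output above.

```shell
# Print the name and status of every pod whose STATUS column is not "Running".
# Assumes the `kubectl get pods` column layout shown above: NAME READY STATUS ...
# `not_running` is a hypothetical helper name, not part of kubectl.
not_running() {
  # NR > 1 skips the header row; $1 is NAME, $3 is STATUS.
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Intended usage (requires access to the cluster):
#   kubectl get pods -n kube-system | not_running
```

Applied to the listing above, this leaves only kube-proxy-windows-j7c27 with CrashLoopBackOff. Pods in other non-Running states, such as Completed, would be listed as well.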
I checked the logs of the failing kube-proxy pod and got the following:
azureuser@Kube-M-001:~$ kubectl logs -n kube-system kube-proxy-windows-j7c27 -p
Directory: C:\host\var\lib\kube-proxy\var\run\secrets\kubernetes.io
Mode LastWriteTime Length Name
---- ------------- ------ ----
d---- 5/3/2021 12:08 PM serviceaccount
Directory: C:\host\k
Mode LastWriteTime Length Name
---- ------------- ------ ----
d---- 5/3/2021 12:24 PM kube-proxy
Using CNI conf file: 10-flannel.conf
I0503 12:30:23.146002 2448 flags.go:59] FLAG: --add-dir-header="false"
I0503 12:30:23.194891 2448 flags.go:59] FLAG: --alsologtostderr="false"
I0503 12:30:23.194891 2448 flags.go:59] FLAG: --bind-address="0.0.0.0"
I0503 12:30:23.194891 2448 flags.go:59] FLAG: --bind-address-hard-fail="false"
I0503 12:30:23.194891 2448 flags.go:59] FLAG: --cleanup="false"
I0503 12:30:23.194891 2448 flags.go:59] FLAG: --cluster-cidr=""
I0503 12:30:23.194891 2448 flags.go:59] FLAG: --config="/var/lib/kube-proxy/config.conf"
I0503 12:30:23.194891 2448 flags.go:59] FLAG: --config-sync-period="15m0s"
I0503 12:30:23.194891 2448 flags.go:59] FLAG: --conntrack-max-per-core="32768"
I0503 12:30:23.194891 2448 flags.go:59] FLAG: --conntrack-min="131072"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --conntrack-tcp-timeout-close-wait="1h0m0s"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --conntrack-tcp-timeout-established="24h0m0s"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --detect-local-mode=""
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --enable-dsr="false"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --feature-gates="WinOverlay=true"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --healthz-bind-address="0.0.0.0:10256"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --healthz-port="10256"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --help="false"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --hostname-override="kube-w-002"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --iptables-masquerade-bit="14"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --iptables-min-sync-period="1s"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --iptables-sync-period="30s"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --ipvs-exclude-cidrs="[]"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --ipvs-min-sync-period="0s"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --ipvs-scheduler=""
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --ipvs-strict-arp="false"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --ipvs-sync-period="30s"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --ipvs-tcp-timeout="0s"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --ipvs-tcpfin-timeout="0s"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --ipvs-udp-timeout="0s"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --kube-api-burst="10"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --kube-api-content-type="application/vnd.kubernetes.protobuf"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --kube-api-qps="5"
I0503 12:30:23.195318 2448 flags.go:59] FLAG: --kubeconfig=""
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --log-backtrace-at=":0"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --log-dir=""
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --log-file=""
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --log-file-max-size="1800"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --log-flush-frequency="5s"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --logtostderr="true"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --masquerade-all="false"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --master=""
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --metrics-bind-address="127.0.0.1:10249"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --metrics-port="10249"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --network-name=""
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --nodeport-addresses="[]"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --one-output="false"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --oom-score-adj="-999"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --profiling="false"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --proxy-mode=""
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --proxy-port-range=""
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --show-hidden-metrics-for-version=""
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --skip-headers="false"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --skip-log-headers="false"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --source-vip=""
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --stderrthreshold="2"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --udp-timeout="250ms"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --v="6"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --version="false"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --vmodule=""
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --windows-service="false"
I0503 12:30:23.195454 2448 flags.go:59] FLAG: --write-config-to=""
I0503 12:30:23.197789 2448 feature_gate.go:243] feature gates: &{map[WinOverlay:true]}
I0503 12:30:23.197789 2448 feature_gate.go:243] feature gates: &{map[WinOverlay:true]}
I0503 12:30:23.200622 2448 loader.go:372] Config loaded from file: /var/lib/kube-proxy/kubeconfig.conf
I0503 12:30:23.221725 2448 server_windows.go:107] Using Kernelspace Proxier.
I0503 12:30:23.221725 2448 server_windows.go:110] creating dualStackProxier for Windows kernel.
time="2021-05-03T12:30:23Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 12"
time="2021-05-03T12:30:23Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 10"
time="2021-05-03T12:30:23Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 12"
time="2021-05-03T12:30:23Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 13"
time="2021-05-03T12:30:23Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 10"
I0503 12:30:23.224600 2448 proxier.go:562] "Cleaning up old HNS policy lists"
I0503 12:30:33.229568 2448 proxier.go:583] "Hns Network loaded" hnsNetworkInfo=&{name:flannel.4096 id:ae948621-bb34-486d-b31d-cf397757b7c1 networkType:Overlay remoteSubnets:[0xc0000b77c0 0xc0000b7840 0xc0000b78c0 0xc0000b7940]}
time="2021-05-03T12:30:33Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 12"
time="2021-05-03T12:30:33Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 10"
time="2021-05-03T12:30:33Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 12"
time="2021-05-03T12:30:33Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 13"
time="2021-05-03T12:30:33Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 10"
time="2021-05-03T12:30:33Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 12"
time="2021-05-03T12:30:33Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 10"
time="2021-05-03T12:30:33Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 12"
time="2021-05-03T12:30:33Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 13"
time="2021-05-03T12:30:33Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 10"
F0503 12:30:33.256757 2448 server.go:489] unable to create proxier: unable to create ipv4 proxier: Could not find host mac address for 0.0.0.0, hostname: kube-w-002, clusterCIDR : 10.244.0.0/16, nodeIP:0.0.0.0
But I think something already went wrong during the Flannel installation, because the logs of the Flannel pod show the following:
PS C:\Users\azureuser> docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0cfa1c0c7b6d mcr.microsoft.com/oss/kubernetes/pause:1.4.1 "cmd /S /C pauseloop…" 2 hours ago Up 2 hours k8s_POD_kube-proxy-windows-j7c27_kube-system_df8fda84-cf94-4ca7-863a-9c9694f2b3ba_8
fb3ccc5e0cf7 sigwindowstools/flannel "pwsh -file /etc/kub…" 2 hours ago Up 2 hours k8s_kube-flannel_kube-flannel-ds-windows-amd64-hwbjc_kube-system_9f0aa635-200b-4902-93cc-1d1da7f49a5d_0
bc8e97427613 mcr.microsoft.com/oss/kubernetes/pause:1.4.1 "cmd /S /C pauseloop…" 2 hours ago Up 2 hours k8s_POD_kube-flannel-ds-windows-amd64-hwbjc_kube-system_9f0aa635-200b-4902-93cc-1d1da7f49a5d_0
PS C:\Users\azureuser> docker logs fb3ccc5e0cf7
Directory: C:\host\etc\cni
Mode LastWriteTime Length Name
---- ------------- ------ ----
d---- 5/3/2021 10:28 AM net.d
Directory: C:\host\etc
Mode LastWriteTime Length Name
---- ------------- ------ ----
d---- 5/3/2021 10:28 AM kube-flannel
Directory: C:\host\opt\cni
Mode LastWriteTime Length Name
---- ------------- ------ ----
d---- 5/3/2021 10:28 AM bin
Directory: C:\host\k
Mode LastWriteTime Length Name
---- ------------- ------ ----
d---- 5/3/2021 10:28 AM flannel
Directory: C:\host\k\flannel\var\run\secrets\kubernetes.io
Mode LastWriteTime Length Name
---- ------------- ------ ----
d---- 5/3/2021 10:28 AM serviceaccount
Configuring CNI for docker
WARNING: The names of some imported commands from the module 'hns' include unapproved verbs that might make them less
discoverable. To find the commands with unapproved verbs, run the Import-Module command again with the Verbose
parameter. For a list of approved verbs, type Get-Verb.
Invoke-HnsRequest : @{Error=An adapter was not found. ; ErrorCode=2151350278; Success=False}
At C:\k\flannel\hns.psm1:233 char:16
+ ... return Invoke-HnsRequest -Method POST -Type networks -Data $Json ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Write-Error], WriteErrorException
+ FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Invoke-HNSRequest
FATA[2021-05-03T10:28:44Z] rpc error: code = Internal desc = could not create IP forward entry: The object already exists.
I0503 10:28:45.340006 5512 main.go:518] Determining IP address of default interface
I0503 10:28:47.695146 5512 main.go:531] Using interface with name Ethernet 2 and address 10.0.64.4
I0503 10:28:47.695146 5512 main.go:548] Defaulting external address to interface address (10.0.64.4)
I0503 10:28:47.767526 5512 kube.go:119] Waiting 10m0s for node controller to sync
I0503 10:28:47.769102 5512 kube.go:306] Starting kube subnet manager
I0503 10:28:48.769283 5512 kube.go:126] Node controller sync successful
I0503 10:28:48.769283 5512 main.go:246] Created subnet manager: Kubernetes Subnet Manager - kube-w-002
I0503 10:28:48.769283 5512 main.go:249] Installing signal handlers
I0503 10:28:48.769283 5512 main.go:390] Found network config - Backend type: vxlan
I0503 10:28:48.769283 5512 vxlan_windows.go:127] VXLAN config: Name=flannel.4096 MacPrefix=0E-2A VNI=4096 Port=4789 GBP=false DirectRouting=false
I0503 10:28:48.838521 5512 device_windows.go:115] Attempting to create HostComputeNetwork &{ flannel.4096 Overlay [] {[]} { [] [] []} [{Static [{10.244.4.0/24 [[123 34 84 121 112 101 34 58 34 86 83 73 68 34 44 34 83 101 116 116 105 110 103 115 34 58 123 34 73 115 111 108 97 116 105 111 110 73 100 34 58 52 48 57 54 125 125]] [{10.244.4.1 0.0.0.0/0 0}]}]}] 8 {2 0}}
E0503 10:28:49.279614 5512 streamwatcher.go:109] Unable to decode an event from the watch stream: read tcp 10.0.64.4:50315-><PUBLIC-IP>:6443: wsarecv: An established connection was aborted by the software in your host machine.
E0503 10:28:49.323566 5512 reflector.go:304] github.com/coreos/flannel/subnet/kube/kube.go:307: Failed to watch *v1.Node: Get "https://kube-lb.eastus.cloudapp.azure.com:6443/api/v1/nodes?resourceVersion=6092&timeoutSeconds=582&watch=true": dial tcp: lookup kube-lb.eastus.cloudapp.azure.com: no such host
I0503 10:28:53.739453 5512 device_windows.go:123] Waiting to get ManagementIP from HostComputeNetwork flannel.4096
I0503 10:28:54.248878 5512 device_windows.go:134] Waiting to get net interface for HostComputeNetwork flannel.4096 (10.0.64.4)
I0503 10:28:54.758966 5512 device_windows.go:148] Created HostComputeNetwork flannel.4096
I0503 10:28:54.804770 5512 main.go:313] Changing default FORWARD chain policy to ACCEPT
I0503 10:28:54.816024 5512 main.go:321] Wrote subnet file to /run/flannel/subnet.env
I0503 10:28:54.816024 5512 main.go:325] Running backend.
I0503 10:28:54.816024 5512 main.go:343] Waiting for all goroutines to exit
I0503 10:28:54.816024 5512 vxlan_network_windows.go:63] Watching for new subnet leases
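Can anyone help me, so that I can use my Windows worker node in the Kubernetes cluster?
Edit 1:
Solved the Flannel FATA error. The problem was caused by Flannel not being able to recognize the network adapter, so before starting Flannel I created the required network manually: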
# First download the HNS module and import it
PS C:\Users\azureuser> curl.exe -LO https://raw.githubusercontent.com/microsoft/SDN/master/Kubernetes/windows/hns.psm1
PS C:\Users\azureuser> Import-Module ./hns.psm1
# Create the network on the adapter that Flannel uses ("Ethernet 2" on this node)
PS C:\Users\azureuser> New-HNSNetwork -Type Overlay -AddressPrefix "192.168.255.0/30" -Gateway "192.168.255.1" -Name "External" -AdapterName "Ethernet 2" -SubnetPolicies @(@{Type = "VSID"; VSID = 9999; });
After that, you can join the Windows node to the cluster and Flannel starts without problems, but the kube-proxy problem remains.