centos - 无法在 CentOS 7 上安装 Kubernetes 集群

Question

按照本指南安装 Kubernetes：

https://www.linuxtechi.com/install-kubernetes-1-7-centos7-rhel7/

进入kubeadm init步骤时，出现错误：

$ kubeadm init --skip-preflight-checks
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.8.3
[init] Using Authorization modes: [Node RBAC]
[preflight] Skipping pre-flight checks
[kubeadm] WARNING: starting in 1.8, tokens expire after 24 hours by default (if you require a non-expiring token use --token-ttl 0)
[certificates] Using the existing ca certificate and key.
[certificates] Using the existing apiserver certificate and key.
[certificates] Using the existing apiserver-kubelet-client certificate and key.
[certificates] Using the existing sa key.
[certificates] Using the existing front-proxy-ca certificate and key.
[certificates] Using the existing front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Using existing up-to-date KubeConfig file: "admin.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "kubelet.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "controller-manager.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] This often takes around a minute; or longer if the control plane images have to be pulled.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by that:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
        - There is no internet connection; so the kubelet can't pull the following control plane images:
                - gcr.io/google_containers/kube-apiserver-amd64:v1.8.3
                - gcr.io/google_containers/kube-controller-manager-amd64:v1.8.3
                - gcr.io/google_containers/kube-scheduler-amd64:v1.8.3

You can troubleshoot this for example with the following commands if you're on a systemd-powered system:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'
couldn't initialize a Kubernetes cluster

检查时systemctl status kubelet：

● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: activating (auto-restart) (Result: exit-code) since Fri 2017-11-10 05:34:12 UTC; 6s ago
     Docs: http://kubernetes.io/docs/
  Process: 29927 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
 Main PID: 29927 (code=exited, status=1/FAILURE)

Nov 10 05:34:12 master systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
Nov 10 05:34:12 master systemd[1]: Unit kubelet.service entered failed state.
Nov 10 05:34:12 master systemd[1]: kubelet.service failed.

检查时journalctl -xeu kubelet：

Nov 10 05:35:15 master systemd[1]: kubelet.service holdoff time over, scheduling restart.
Nov 10 05:35:15 master systemd[1]: Started kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has finished starting up.
--
-- The start-up result is done.
Nov 10 05:35:15 master systemd[1]: Starting kubelet: The Kubernetes Node Agent...
-- Subject: Unit kubelet.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has begun starting up.
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.364837   30174 feature_gate.go:156] feature gates: map[]
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.364917   30174 controller.go:114] kubelet config controller: starting controller
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.364921   30174 controller.go:118] kubelet config controller: validating combination of defaults and flags
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.375149   30174 client.go:75] Connecting to docker on unix:///var/run/docker.sock
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.375226   30174 client.go:95] Start docker client with request timeout=2m0s
Nov 10 05:35:15 master kubelet[30174]: W1110 05:35:15.377200   30174 cni.go:196] Unable to update cni config: No networks found in /etc/cni/net.d
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.382890   30174 feature_gate.go:156] feature gates: map[]
Nov 10 05:35:15 master kubelet[30174]: W1110 05:35:15.383011   30174 server.go:289] --cloud-provider=auto-detect is deprecated. The desired cloud provider should be set explicitly
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.408678   30174 certificate_manager.go:361] Requesting new certificate.
Nov 10 05:35:15 master kubelet[30174]: E1110 05:35:15.409287   30174 certificate_manager.go:284] Failed while requesting a signed certificate from the master: cannot create certificate signing request: Post https://10.0.2.15:6443/apis/certificates.k8s.io/v1beta1/certifica
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.411480   30174 manager.go:149] cAdvisor running in container: "/sys/fs/cgroup/cpu,cpuacct/system.slice/kubelet.service"
Nov 10 05:35:15 master kubelet[30174]: W1110 05:35:15.425796   30174 manager.go:157] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp [::1]:15441: getsockopt: connection refused
Nov 10 05:35:15 master kubelet[30174]: W1110 05:35:15.426006   30174 manager.go:166] unable to connect to CRI-O api service: Get http://%2Fvar%2Frun%2Fcrio.sock/info: dial unix /var/run/crio.sock: connect: no such file or directory
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.440364   30174 fs.go:139] Filesystem UUIDs: map[4537d533-47ff-463c-bffc-7ce294d9c93a:/dev/dm-1 598bbfb9-027e-4f52-a5b3-c4d3d1fbc2b8:/dev/dm-0 8ffa0ee9-e1a8-4c03-acce-b65b342c6935:/dev/sda2]
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.440395   30174 fs.go:140] Filesystem partitions: map[tmpfs:{mountpoint:/dev/shm major:0 minor:17 fsType:tmpfs blockSize:0} /dev/mapper/VolGroup00-LogVol00:{mountpoint:/var/lib/docker/overlay major:253 minor:0 fsType:xf
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.441589   30174 manager.go:216] Machine: {NumCores:1 CpuFrequency:3100000 MemoryCapacity:1040621568 HugePages:[{PageSize:2048 NumPages:0}] MachineID:a0b78b0170c248288e172d5196d59063 SystemUUID:A0B78B01-70C2-4828-8E17-2D
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.446544   30174 manager.go:222] Version: {KernelVersion:3.10.0-693.5.2.el7.x86_64 ContainerOsVersion:CentOS Linux 7 (Core) DockerVersion:17.09.0-ce DockerAPIVersion:1.32 CadvisorVersion: CadvisorRevision:}
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.447201   30174 server.go:422] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.451260   30174 container_manager_linux.go:252] container manager verified user specified cgroup-root exists: /
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.451293   30174 container_manager_linux.go:257] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.451403   30174 container_manager_linux.go:288] Creating device plugin handler: false
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.451616   30174 kubelet.go:273] Adding manifest file: /etc/kubernetes/manifests
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.451710   30174 kubelet.go:283] Watching apiserver
Nov 10 05:35:15 master kubelet[30174]: E1110 05:35:15.480061   30174 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:422: Failed to list *v1.Node: Get https://10.0.2.15:6443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster&resourceVersion=0: dial tcp 10.0.2.15
Nov 10 05:35:15 master kubelet[30174]: E1110 05:35:15.500829   30174 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:413: Failed to list *v1.Service: Get https://10.0.2.15:6443/api/v1/services?resourceVersion=0: dial tcp 10.0.2.15:6443: getsockopt: connection r
Nov 10 05:35:15 master kubelet[30174]: E1110 05:35:15.500917   30174 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.0.2.15:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dmaster&resourceVersion=0: dial tcp 10.
Nov 10 05:35:15 master kubelet[30174]: W1110 05:35:15.541334   30174 kubelet_network.go:69] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.541369   30174 kubelet.go:517] Hairpin mode set to "hairpin-veth"
Nov 10 05:35:15 master kubelet[30174]: W1110 05:35:15.541616   30174 cni.go:196] Unable to update cni config: No networks found in /etc/cni/net.d
Nov 10 05:35:15 master kubelet[30174]: W1110 05:35:15.548689   30174 cni.go:196] Unable to update cni config: No networks found in /etc/cni/net.d
Nov 10 05:35:15 master kubelet[30174]: W1110 05:35:15.553143   30174 cni.go:196] Unable to update cni config: No networks found in /etc/cni/net.d
Nov 10 05:35:15 master kubelet[30174]: I1110 05:35:15.553164   30174 docker_service.go:207] Docker cri networking managed by cni
Nov 10 05:35:15 master kubelet[30174]: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
Nov 10 05:35:15 master systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
Nov 10 05:35:15 master systemd[1]: Unit kubelet.service entered failed state.
Nov 10 05:35:15 master systemd[1]: kubelet.service failed.

score 3 · Accepted Answer

日志中的关键点misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

确保 kubelet 使用的 cgroup 驱动与 Docker 使用的相同。

为了确保兼容性，您可以更新 Docker，或者确保将--cgroup-driverkubelet 标志设置为与 Docker 相同的值（例如 cgroupfs）

--安装 kubeadm

更新 Docker 以使用`systemd`

cat << EOF > /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

并重新启动 docker 服务。

或者更新 kubelet 以使用`cgroupfs`

sed -i -E 's/--cgroup-driver=systemd/--cgroup-driver=cgroupfs/' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

并通过重启 kubelet systemctl restart kubelet.service。

centos - 无法在 CentOS 7 上安装 Kubernetes 集群

1 回答 1

更新 Docker 以使用systemd

或者更新 kubelet 以使用cgroupfs

Related

Reference

更新 Docker 以使用`systemd`

或者更新 kubelet 以使用`cgroupfs`