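I have a 5-node Kubernetes cluster, one node of which is the master (set up with kubeadm). When I first deployed the master, I also deployed the Kubernetes dashboard, so it runs on that same machine. After that, I joined the other nodes to the cluster.
Now, when I deploy a pod from a YAML file, it stays in the ContainerCreating state. So I described the pod to see which machine it had been scheduled on. I SSHed into that machine and first checked docker ps -a.
I could tell that the image had not been started, or even pulled. So I looked at the kubelet logs (I did not copy everything, but this should give a good idea):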
E0131 11:05:40.486422 2873 server.go:459] Kubelet needs to run as uid `0`. It is being run as 1000
W0131 11:05:40.486616 2873 server.go:469] write /proc/self/oom_score_adj: permission denied
W0131 11:05:40.486978 2873 server.go:669] No api server defined - no events will be sent to API server.
W0131 11:05:40.491423 2873 kubelet_network.go:69] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I0131 11:05:40.491498 2873 kubelet.go:477] Hairpin mode set to "hairpin-veth"
W0131 11:05:40.495353 2873 plugins.go:210] can't set sysctl net/bridge/bridge-nf-call-iptables: open /proc/sys/net/bridge/bridge-nf-call-iptables: permission denied
I0131 11:05:40.503259 2873 docker_manager.go:257] Setting dockerRoot to /var/lib/docker
I0131 11:05:40.503308 2873 docker_manager.go:260] Setting cgroupDriver to cgroupfs
I0131 11:05:40.506028 2873 server.go:770] Started kubelet v1.5.2
E0131 11:05:40.506209 2873 server.go:481] Starting health server failed: listen tcp 127.0.0.1:10248: bind: address already in use
E0131 11:05:40.506300 2873 kubelet.go:1145] Image garbage collection failed: unable to find data for container /
I0131 11:05:40.506413 2873 server.go:123] Starting to listen on 0.0.0.0:10250
W0131 11:05:40.506445 2873 kubelet.go:1224] No api server defined - no node status update will be sent.
E0131 11:05:40.507209 2873 kubelet.go:1228] error creating pods directory: mkdir /var/lib/kubelet/pods: permission denied
I0131 11:05:40.509613 2873 status_manager.go:125] Kubernetes client is nil, not starting status manager.
I0131 11:05:40.509656 2873 kubelet.go:1714] Starting kubelet main sync loop.
I0131 11:05:40.509710 2873 kubelet.go:1725] skipping pod synchronization - [error creating pods directory: mkdir /var/lib/kubelet/pods: permission denied container runtime is down]
F0131 11:05:40.509522 2873 server.go:148] listen tcp 0.0.0.0:10255: bind: address already in use
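There are a lot of permission problems, and I do not know how to fix them. I added root and my user account to the docker group to see whether that would fix it, but it did not.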
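To make the failures in output like this easier to spot, here is a minimal sketch that keeps only the error and fatal lines. It assumes the log uses the usual klog-style prefix (a severity letter followed by the MMDD date, as in the lines above); the function name filter_errors is hypothetical and not part of any Kubernetes tooling.

```shell
# Keep only error (E) and fatal (F) lines from klog-style output read on stdin.
# Assumption: each line starts with a severity letter plus a 4-digit date, e.g. "E0131 ...".
filter_errors() {
  grep -E '^[EF][0-9]{4} '
}

# Example using two lines copied from the log above; only the F line is kept.
printf '%s\n' \
  'I0131 11:05:40.506028 2873 server.go:770] Started kubelet v1.5.2' \
  'F0131 11:05:40.509522 2873 server.go:148] listen tcp 0.0.0.0:10255: bind: address already in use' \
  | filter_errors
```

Run against the full log, this would leave the uid, health server, garbage collection, pods directory, and port 10255 messages.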
Update
Above I ran kubelet logs (without sudo), which is why you see the uid message. When I run sudo kubelet logs, I get these results:
I0201 15:36:01.386564 5082 feature_gate.go:181] feature gates: map[]
W0201 15:36:01.386861 5082 server.go:400] No API client: no api servers specified
I0201 15:36:01.386953 5082 docker.go:356] Connecting to docker on unix:///var/run/docker.sock
I0201 15:36:01.386991 5082 docker.go:376] Start docker client with request timeout=2m0s
I0201 15:36:01.401737 5082 manager.go:143] cAdvisor running in container: "/user.slice"
W0201 15:36:01.415664 5082 manager.go:151] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp [::1]:15441: getsockopt: connection refused
I0201 15:36:01.431725 5082 fs.go:117] Filesystem partitions: map[/dev/mmcblk0p2:{mountpoint:/var/lib/docker/aufs major:179 minor:2 fsType:ext4 blockSize:0}]
I0201 15:36:01.440439 5082 manager.go:198] Machine: {NumCores:4 CpuFrequency:1920000 MemoryCapacity:3519315968 MachineID:a9807123b38d1f069a44f0b7588b5884 SystemUUID:03000200-0400-0500-0006-000700080009 BootID:7e71fe9b-a9d8-4921-80c7-9d09e49ed1ef Filesystems:[{Device:/dev/mmcblk0p2 Capacity:57295605760 Type:vfs Inodes:3563520 HasInodes:true}] DiskMap:map[179:0:{Name:mmcblk0 Major:179 Minor:0 Size:62545461248 Scheduler:deadline} 179:8:{Name:mmcblk0boot0 Major:179 Minor:8 Size:4194304 Scheduler:deadline} 179:16:{Name:mmcblk0boot1 Major:179 Minor:16 Size:4194304 Scheduler:deadline} 179:24:{Name:mmcblk0rpmb Major:179 Minor:24 Size:4194304 Scheduler:deadline}] NetworkDevices:[{Name:datapath MacAddress:72:36:99:b2:ba:be Speed:0 Mtu:1410} {Name:dummy0 MacAddress:ea:c7:5e:6d:29:75 Speed:0 Mtu:1500} {Name:enp1s0 MacAddress:00:07:32:3e:17:8c Speed:1000 Mtu:1500} {Name:vxlan-6784 MacAddress:5a:81:bb:f6:00:d7 Speed:0 Mtu:1500} {Name:weave MacAddress:92:64:f5:c5:57:a7 Speed:0 Mtu:1410}] Topology:[{Id:0 Memory:3519315968 Cores:[{Id:0 Threads:[0] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]} {Id:1 Threads:[1] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]} {Id:2 Threads:[2] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]} {Id:3 Threads:[3] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]}] Caches:[]}] CloudProvider:Unknown InstanceType:Unknown InstanceID:None}
I0201 15:36:01.442170 5082 manager.go:204] Version: {KernelVersion:4.4.0-31-generic ContainerOsVersion:Ubuntu 16.04.1 LTS DockerVersion:1.12.3 CadvisorVersion: CadvisorRevision:}
I0201 15:36:01.444559 5082 cadvisor_linux.go:152] Failed to register cAdvisor on port 4194, retrying. Error: listen tcp :4194: bind: address already in use
W0201 15:36:01.449146 5082 container_manager_linux.go:205] Running with swap on is not supported, please disable swap! This will be a fatal error by default starting in K8s v1.6! In the meantime, you can opt-in to making this a fatal error by enabling --experimental-fail-swap-on.
W0201 15:36:01.449653 5082 server.go:669] No api server defined - no events will be sent to API server.
W0201 15:36:01.457574 5082 kubelet_network.go:69] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I0201 15:36:01.457658 5082 kubelet.go:477] Hairpin mode set to "hairpin-veth"
I0201 15:36:01.471512 5082 docker_manager.go:257] Setting dockerRoot to /var/lib/docker
I0201 15:36:01.471570 5082 docker_manager.go:260] Setting cgroupDriver to cgroupfs
I0201 15:36:01.474678 5082 server.go:770] Started kubelet v1.5.2
E0201 15:36:01.474926 5082 server.go:481] Starting health server failed: listen tcp 127.0.0.1:10248: bind: address already in use
E0201 15:36:01.475062 5082 kubelet.go:1145] Image garbage collection failed: unable to find data for container /
W0201 15:36:01.475208 5082 kubelet.go:1224] No api server defined - no node status update will be sent.
I0201 15:36:01.475702 5082 kubelet_node_status.go:204] Setting node annotation to enable volume controller attach/detach
I0201 15:36:01.479587 5082 server.go:123] Starting to listen on 0.0.0.0:10250
F0201 15:36:01.481605 5082 server.go:148] listen tcp 0.0.0.0:10255: bind: address already in use
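With sudo the permission errors are gone, and what remains is mostly "address already in use". As a rough sketch, the ports involved can be pulled out of the log like this. The function name busy_ports is hypothetical, and the parsing assumes the exact "<port>: bind: address already in use" wording seen in the lines above.

```shell
# Print each distinct port that a log on stdin reports as already bound.
# Assumption: the message has the form "...:<port>: bind: address already in use".
busy_ports() {
  grep 'bind: address already in use' \
    | sed -E 's/.*:([0-9]+): bind: address already in use.*/\1/' \
    | sort -un
}

# Example using three lines copied from the sudo run above.
printf '%s\n' \
  'E0201 15:36:01.474926 5082 server.go:481] Starting health server failed: listen tcp 127.0.0.1:10248: bind: address already in use' \
  'I0201 15:36:01.444559 5082 cadvisor_linux.go:152] Failed to register cAdvisor on port 4194, retrying. Error: listen tcp :4194: bind: address already in use' \
  'F0201 15:36:01.481605 5082 server.go:148] listen tcp 0.0.0.0:10255: bind: address already in use' \
  | busy_ports

# To see which process holds one of these ports, something like
#   sudo ss -ltnp
# could be run on the node (assumes the ss tool from iproute2 is installed).
```

If the owner of those ports turns out to be a kubelet that is already running as a service, that would suggest the command above started a second instance rather than printing logs, but I have not confirmed that.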