I have a Kubernetes onebox deployment with the following (containerized) components, all running with --net=host. The kubelet runs as a privileged Docker container, and the Kubernetes flag --allow-privileged is set to true.
gcr.io/google_containers/hyperkube-amd64:v1.7.9 "/bin/bash -c './hype" kubelet
gcr.io/google_containers/hyperkube-amd64:v1.7.9 "/bin/bash -c './hype" kube-proxy
gcr.io/google_containers/hyperkube-amd64:v1.7.9 "/bin/bash -c './hype" kube-scheduler
gcr.io/google_containers/hyperkube-amd64:v1.7.9 "/bin/bash -c './hype" kube-controller-manager
gcr.io/google_containers/hyperkube-amd64:v1.7.9 "/bin/bash -c './hype" kube-apiserver
quay.io/coreos/etcd:v3.1.0 "/usr/local/bin/etcd " etcd
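On top of that, I enabled the addon manager with kubectl create -f https://github.com/kubernetes/kubernetes/blob/master/test/kubemark/resources/manifests/kube-addon-manager.yaml and installed the default YAML manifests for calico 2.6.1 and kube-dns 1.14.5 into /etc/kubernetes/addons/. The calico pod comes up as expected with two containers (install-cni and calico-node).
However, kube-dns is stuck in ContainerCreating or ContainerCannotRun, with the following errors in the kubelet log while it tries to start the Kubernetes pause container: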
{"log":"I1111 00:35:19.549318 1 manager.go:913] Added container: \"/kubepods/burstable/pod3173eef3-c678-11e7-ac4b-e41d2d59689e/1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54\" (aliases: [k8s_POD_kube-dns-v20-141138543-pmdww_kube-system_3173eef3-c678-11e7-ac4b-e41d2d59689e_0 1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54], namespace: \"docker\")\n","stream":"stderr","time":"2017-11-11T00:35:19.5526284Z"}
{"log":"I1111 00:35:19.549433 1 cni.go:291] About to add CNI network cni-loopback (type=loopback)\n","stream":"stderr","time":"2017-11-11T00:35:19.5526748Z"}
{"log":"I1111 00:35:19.549504 1 handler.go:325] Added event \u0026{/kubepods/burstable/pod3173eef3-c678-11e7-ac4b-e41d2d59689e/1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54 2017-11-11 00:35:19.3931718 +0000 UTC containerCreation {\u003cnil\u003e}}\n","stream":"stderr","time":"2017-11-11T00:35:19.5527217Z"}
{"log":"I1111 00:35:19.551134 1 container.go:407] Start housekeeping for container \"/kubepods/burstable/pod3173eef3-c678-11e7-ac4b-e41d2d59689e/1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54\"\n","stream":"stderr","time":"2017-11-11T00:35:19.5527441Z"}
{"log":"E1111 00:35:19.555099 1 cni.go:294] Error adding network: failed to Statfs \"/proc/54226/ns/net\": no such file or directory\n","stream":"stderr","time":"2017-11-11T00:35:19.5553606Z"}
{"log":"E1111 00:35:19.555122 1 cni.go:237] Error while adding to cni lo network: failed to Statfs \"/proc/54226/ns/net\": no such file or directory\n","stream":"stderr","time":"2017-11-11T00:35:19.5553887Z"}
{"log":"I1111 00:35:19.600281 1 manager.go:970] Destroyed container: \"/kubepods/burstable/pod3173eef3-c678-11e7-ac4b-e41d2d59689e/1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54\" (aliases: [k8s_POD_kube-dns-v20-141138543-pmdww_kube-system_3173eef3-c678-11e7-ac4b-e41d2d59689e_0 1dd57d6f6c996d7abe061f6236fc8a0150cf6f95d16d5c3c462c9ed7158d3c54], namespace: \"docker\")\n","stream":"stderr","time":"2017-11-11T00:35:19.6005722Z"}
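To pick the error lines out of a log like the one above, a small filter can help. This is only a sketch: it assumes the file is in Docker's json-file format (one JSON object per line with a "log" field) and that each message starts with the usual glog-style header (severity letter, MMDD, time, thread id, file:line). The names parse_line and errors_only are made up for illustration.

```python
import json
import re

# glog-style header: severity, MMDD, HH:MM:SS.micros, thread id, file:line]
KLOG_RE = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+(\d+) ([^:]+):(\d+)\] (.*)$",
    re.S,
)


def parse_line(raw):
    """Parse one Docker json-file log line; return None if the header does not match."""
    entry = json.loads(raw)
    match = KLOG_RE.match(entry["log"].rstrip("\n"))
    if match is None:
        return None
    severity, _date, clock, _tid, source, lineno, message = match.groups()
    return {
        "severity": severity,
        "time": clock,
        "source": f"{source}:{lineno}",
        "message": message,
    }


def errors_only(lines):
    """Keep only entries logged at error (E) or fatal (F) severity."""
    found = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        parsed = parse_line(raw)
        if parsed is not None and parsed["severity"] in ("E", "F"):
            found.append(parsed)
    return found
```

Calling errors_only(open(path)) on the kubelet container's log file would return the two cni.go entries from the excerpt above; where that file lives depends on the Docker setup.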
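I see the /pause containers keep coming up, only to exit about a second later with an innocuous message (this one is old; I stopped the cluster so it wouldn't keep spawning more containers):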
ubuntu@r172-16-6-39:~$ docker ps -a | grep 216e39defa36
216e39defa36 gcr.io/google_containers/pause-amd64:3.0 "/pause" About an hour ago Exited (0) About an hour ago k8s_POD_kube-dns-v20-141138543-xvdmv_kube-system_0594732f-c688-11e7-9da5-e41d2d59689e_17
ubuntu@r172-16-6-39:~$ docker logs 216e39defa36
shutting down, got signal: Terminated
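The directory /proc/54226 does not exist on my host, which I assume is why CNI is complaining. But Calico's pause container is fine, running the same image, so either the write fails only in the kube-dns case, or nothing attempts it in the Calico case. I found some references to similar errors on OpenShift that were related to SELinux, but I'm running a barebones Ubuntu 14.04 VM that doesn't even have SELinux installed.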
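To make the "does the path exist" check repeatable, here is a minimal sketch that looks up a process's network namespace link under /proc. The function names and the proc_root parameter are hypothetical. The two explanations in the "missing" message are guesses rather than a diagnosis: the pause process may already have exited, or the PID may have been reported from a different PID namespace than the one being inspected.

```python
import os


def netns_path(pid, proc_root="/proc"):
    """Build the path the CNI error refers to, e.g. /proc/54226/ns/net."""
    return os.path.join(proc_root, str(pid), "ns", "net")


def describe_netns(pid, proc_root="/proc"):
    """Report whether the network namespace link for `pid` can be resolved."""
    path = netns_path(pid, proc_root)
    try:
        # On Linux this is a symlink whose target looks like net:[4026531956].
        target = os.readlink(path)
    except FileNotFoundError:
        return (
            f"{path}: missing (process already exited, "
            "or the PID belongs to another PID namespace)"
        )
    except PermissionError:
        return f"{path}: present but not readable (try running as root)"
    return f"{path} -> {target}"
```

Running describe_netns(54226) on the host and again from inside the kubelet container would show whether the two see the same /proc.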
ubuntu@r172-16-6-39:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.4 LTS
Release: 14.04
Codename: trusty
ubuntu@r172-16-6-39:~$ setenforce
The program 'setenforce' is currently not installed. You can install it by typing:
sudo apt-get install selinux-utils
My CNI conf is also simple; it was generated by the calico install-cni container:
ubuntu@r172-16-6-39:~$ cat /etc/cni/net.d/10-calico.conf
{
"name": "k8s-pod-network",
"cniVersion": "0.1.0",
"type": "calico",
"log_level": "debug",
"datastore_type": "kubernetes",
"nodename": "172.16.6.39",
"mtu": 1500,
"ipam": {
"type": "host-local",
"subnet": "usePodCidr"
},
"policy": {
"type": "k8s",
"k8s_auth_token": "****"
},
"kubernetes": {
"k8s_api_root": "https://168.16.0.1:443",
"kubeconfig": "/etc/kubernetes/kubeconfig"
}
}
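For a quick sanity check of a conf file like the one above, the following sketch loads it and pulls out the fields that matter here. It assumes the file is plain JSON and checks only that "name" and "type" are present; the helper name summarize_cni_conf and the selection of fields are my own choices, not anything Calico provides.

```python
import json

# Assumption: only these two top-level keys are checked by this sketch.
REQUIRED_KEYS = ("name", "type")


def summarize_cni_conf(text):
    """Parse a CNI network conf (JSON text) and return its key settings."""
    conf = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in conf]
    if missing:
        raise ValueError(f"CNI conf is missing keys: {missing}")
    ipam = conf.get("ipam", {})
    return {
        "name": conf["name"],
        "plugin": conf["type"],
        "cniVersion": conf.get("cniVersion"),
        "ipam_plugin": ipam.get("type"),
        "ipam_subnet": ipam.get("subnet"),
        "api_root": conf.get("kubernetes", {}).get("k8s_api_root"),
    }
```

Feeding it the contents of /etc/cni/net.d/10-calico.conf gives a short summary that is easier to compare against a working node than the raw file.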
Has anyone run into something similar?