2

看起来问题是因为 CNI (calico) 但不确定 ICP 中的修复方法是什么(请参阅下面的 journalctl -u kubelet 日志)

ICP 安装程序日志:

FAILED! => {"attempts": 100, "changed": true, "cmd": "kubectl -n kube-system get daemonset kube-dns -o=custom-columns=A:.status.numberAvailable,B:.status.desiredNumberScheduled --no-headers=true | tr -s \" \" | awk '$1 == $2 {print \"READY\"}'", "delta": "0:00:00.403489", "end": "2018-07-08 09:04:21.922839", "rc": 0, "start": "2018-07-08 09:04:21.519350", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

journalctl -u kubelet

Jul 08 22:40:38 dev-master hyperkube[2763]: E0708 22:40:38.548157    2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: nodes is forbidden: User "kubelet" cannot list nodes at the cluster scope
Jul 08 22:40:38 dev-master hyperkube[2763]: E0708 22:40:38.549872    2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: pods is forbidden: User "kubelet" cannot list pods at the cluster scope
Jul 08 22:40:38 dev-master hyperkube[2763]: E0708 22:40:38.555379    2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: services is forbidden: User "kubelet" cannot list services at the cluster scope
Jul 08 22:40:38 dev-master hyperkube[2763]: E0708 22:40:38.738411    2763 event.go:200] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"k8s-master-10.50.50.201.153f85e7528e5906", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"k8s-master-10.50.50.201", UID:"b0ed63e50c3259666286e5a788d12b81", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{scheduler}"}, Reason:"Started", Message:"Started container", Source:v1.EventSource{Component:"kubelet", Host:"10.50.50.201"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbec8c296b58a5506, ext:106413065445, loc:(*time.Location)(0xb58e300)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbec8c296b58a5506, ext:106413065445, loc:(*time.Location)(0xb58e300)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "kubelet" cannot create events in the namespace "kube-system"' (will not retry!)

Jul 08 22:40:43 dev-master hyperkube[2763]: E0708 22:40:43.938806    2763 kubelet.go:2130] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
    Jul 08 22:40:44 dev-master hyperkube[2763]: E0708 22:40:44.556337    2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: nodes is forbidden: User "kubelet" cannot list nodes at the cluster scope
    Jul 08 22:40:44 dev-master hyperkube[2763]: E0708 22:40:44.557513    2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: pods is forbidden: User "kubelet" cannot list pods at the cluster scope
    Jul 08 22:40:44 dev-master hyperkube[2763]: E0708 22:40:44.561007    2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: services is forbidden: User "kubelet" cannot list services at the cluster scope
    Jul 08 22:40:45 dev-master hyperkube[2763]: E0708 22:40:45.557584    2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: nodes is forbidden: User "kubelet" cannot list nodes at the cluster scope
    Jul 08 22:40:45 dev-master hyperkube[2763]: E0708 22:40:45.558375    2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: pods is forbidden: User "kubelet" cannot list pods at the cluster scope
    Jul 08 22:40:45 dev-master hyperkube[2763]: E0708 22:40:45.561807    2763 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: services is forbidden: User "kubelet" cannot list services at the cluster scope
    Jul 08 22:40:46 dev-master hyperkube[2763]: I0708 22:40:46.393905    2763 kubelet_node_status.go:289] Setting node annotation to enable volume controller attach/detach
    Jul 08 22:40:46 dev-master hyperkube[2763]: I0708 22:40:46.396261    2763 kubelet_node_status.go:83] Attempting to register node 10.50.50.201
    Jul 08 22:40:46 dev-master hyperkube[2763]: E0708 22:40:46.397540    2763 kubelet_node_status.go:107] Unable to register node "10.50.50.201" with API server: nodes is forbidden: User "kubelet" cannot create nodes at the cluster scope

Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.161949    9689 cni.go:259] Error adding network: no configured Calico pools
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.161980    9689 cni.go:227] Error while adding to cni network: no configured Calico pools
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.468392    9689 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "kube-dns-splct_kube-system" network: no configured Calico
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.468455    9689 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "kube-dns-splct_kube-system(113e64b2-82e6-11e8-83bb-0242a9e42805)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up 
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.468479    9689 kuberuntime_manager.go:646] createPodSandbox for pod "kube-dns-splct_kube-system(113e64b2-82e6-11e8-83bb-0242a9e42805)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.468556    9689 pod_workers.go:186] Error syncing pod 113e64b2-82e6-11e8-83bb-0242a9e42805 ("kube-dns-splct_kube-system(113e64b2-82e6-11e8-83bb-0242a9e42805)"), skipping: failed to "CreatePodSandbox" for "kube-d
Jul 08 19:43:48 dev-master hyperkube[9689]: I0708 19:43:48.938222    9689 kuberuntime_manager.go:513] Container {Name:calico-node Image:ibmcom/calico-node:v3.0.4 Command:[] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[{Name:ETCD_ENDPOINTS Value: ValueFrom:&EnvVarSource
Jul 08 19:43:48 dev-master hyperkube[9689]: e:FELIX_HEALTHENABLED Value:true ValueFrom:nil} {Name:IP_AUTODETECTION_METHOD Value:can-reach=10.50.50.201 ValueFrom:nil}] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:lib-modules ReadOnly:true MountPath:/lib/m
Jul 08 19:43:48 dev-master hyperkube[9689]: I0708 19:43:48.938449    9689 kuberuntime_manager.go:757] checking backoff for container "calico-node" in pod "calico-node-wpln7_kube-system(10107b3e-82e6-11e8-83bb-0242a9e42805)"
Jul 08 19:43:48 dev-master hyperkube[9689]: I0708 19:43:48.938699    9689 kuberuntime_manager.go:767] Back-off 5m0s restarting failed container=calico-node pod=calico-node-wpln7_kube-system(10107b3e-82e6-11e8-83bb-0242a9e42805)
Jul 08 19:43:48 dev-master hyperkube[9689]: E0708 19:43:48.938735    9689 pod_workers.go:186] Error syncing pod 10107b3e-82e6-11e8-83bb-0242a9e42805 ("calico-node-wpln7_kube-system(10107b3e-82e6-11e8-83bb-0242a9e42805)"), skipping: failed to "StartContainer" for "calic
lines 4918-4962/4962 (END)

docker ps (master node) : Container-> k8s_POD_kube-dns-splct_kube-system-* 反复崩溃。

CONTAINER ID        IMAGE                            COMMAND                  CREATED             STATUS                  PORTS               NAMES
ed24d636fdd1        ibmcom/pause:3.0                 "/pause"                 1 second ago        Up Less than a second                       k8s_POD_kube-dns-splct_kube-system_113e64b2-82e6-11e8-83bb-0242a9e42805_121
49b648837900        ibmcom/calico-cni                "/install-cni.sh"        5 minutes ago       Up 5 minutes                                k8s_install-cni_calico-node-wpln7_kube-system_10107b3e-82e6-11e8-83bb-0242a9e42805_0
933ff30177de        ibmcom/calico-kube-controllers   "/usr/bin/kube-contr…"   5 minutes ago       Up 5 minutes                                k8s_calico-kube-controllers_calico-kube-controllers-759f7fc556-mm5tg_kube-system_1010712e-82e6-11e8-83bb-0242a9e42805_0
12e9262299af        ibmcom/pause:3.0                 "/pause"                 6 minutes ago       Up 5 minutes                                k8s_POD_calico-kube-controllers-759f7fc556-mm5tg_kube-system_1010712e-82e6-11e8-83bb-0242a9e42805_0
8dcb2b2b3eb5        ibmcom/pause:3.0                 "/pause"                 6 minutes ago       Up 5 minutes                                k8s_POD_calico-node-wpln7_kube-system_10107b3e-82e6-11e8-83bb-0242a9e42805_0
9486ff78df31        ibmcom/tiller                    "/tiller"                6 minutes ago       Up 6 minutes                                k8s_tiller_tiller-deploy-c59888d97-7nwph_kube-system_016019ab-82e6-11e8-83bb-0242a9e42805_0
e5588f68af1b        ibmcom/pause:3.0                 "/pause"                 6 minutes ago       Up 6 minutes                                k8s_POD_tiller-deploy-c59888d97-7nwph_kube-system_016019ab-82e6-11e8-83bb-0242a9e42805_0
e80460d857ff        ibmcom/icp-image-manager         "/icp-image-manager …"   10 minutes ago      Up 10 minutes                               k8s_image-manager_image-manager-0_kube-system_7b7554ce-82e5-11e8-83bb-0242a9e42805_0
e207175f19b7        ibmcom/registry                  "/entrypoint.sh /etc…"   10 minutes ago      Up 10 minutes                               k8s_icp-registry_image-manager-0_kube-system_7b7554ce-82e5-11e8-83bb-0242a9e42805_0
477faf0668f3        ibmcom/pause:3.0                 "/pause"                 10 minutes ago      Up 10 minutes                               k8s_POD_image-manager-0_kube-system_7b7554ce-82e5-11e8-83bb-0242a9e42805_0
8996bb8c37b7        d4b6454d4873                     "/hyperkube schedule…"   10 minutes ago      Up 10 minutes                               k8s_scheduler_k8s-master-10.50.50.201_kube-system_9e5bce1f08c050be21fa6380e4e363cc_0
835ee941432c        d4b6454d4873                     "/hyperkube apiserve…"   10 minutes ago      Up 10 minutes                               k8s_apiserver_k8s-master-10.50.50.201_kube-system_9e5bce1f08c050be21fa6380e4e363cc_0
de409ff63cb2        d4b6454d4873                     "/hyperkube controll…"   10 minutes ago      Up 10 minutes                               k8s_controller-manager_k8s-master-10.50.50.201_kube-system_9e5bce1f08c050be21fa6380e4e363cc_0
716032a308ea        ibmcom/pause:3.0                 "/pause"                 10 minutes ago      Up 10 minutes                               k8s_POD_k8s-master-10.50.50.201_kube-system_9e5bce1f08c050be21fa6380e4e363cc_0
bd9e64e3d6a2        d4b6454d4873                     "/hyperkube proxy --…"   12 minutes ago      Up 12 minutes                               k8s_proxy_k8s-proxy-10.50.50.201_kube-system_3e068267cfe8f990cd2c9a4635be044d_0
bab3c9ef7e40        ibmcom/pause:3.0                 "/pause"                 12 minutes ago      Up 12 minutes                               k8s_POD_k8s-proxy-10.50.50.201_kube-system_3e068267cfe8f990cd2c9a4635be044d_0

Kubectl(主节点):我相信此时 kube 应该已经初始化并运行,但似乎不是。

kubectl get pods -s 127.0.0.1:8888 --all-namespaces
The connection to the server 127.0.0.1:8888 was refused - did you specify the right host or port?

以下是我尝试过的选项:

  1. 创建启用和禁用 IP_IP 的集群。由于所有节点都在同一个子网上,IP_IP 设置应该不会产生影响。
  2. Etcd 在单独的节点上运行并作为主节点的一部分
  3. ifconfig tunl0 在上述所有场景中返回以下(即没有 IP 分配):

    tunl0链路封装:IPIP 隧道 HWaddr

        NOARP  MTU:1480  Metric:1
        RX packets:0 errors:0 dropped:0 overruns:0 frame:0
        TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000
        RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
    
  4. “calicoctl get profile”返回空,“calicoctl get nodes”也是如此,我相信这是因为,calico 尚未配置。

其他检查、想法和选择?

Calico Kube 控制器日志(重复):

2018-07-09 05:46:08.440 [WARNING][1] cache.go 278: Value for key has changed, queueing update to reprogram key="kns.default" type=v3.Profile
2018-07-09 05:46:08.440 [WARNING][1] cache.go 278: Value for key has changed, queueing update to reprogram key="kns.kube-public" type=v3.Profile
2018-07-09 05:46:08.440 [WARNING][1] cache.go 278: Value for key has changed, queueing update to reprogram key="kns.kube-system" type=v3.Profile
2018-07-09 05:46:08.440 [INFO][1] namespace_controller.go 223: Create/Update Profile in Calico datastore key="kns.default"
2018-07-09 05:46:08.441 [INFO][1] namespace_controller.go 246: Update Profile in Calico datastore with resource version  key="kns.default"
2018-07-09 05:46:08.442 [INFO][1] namespace_controller.go 252: Successfully updated profile key="kns.default"
2018-07-09 05:46:08.442 [INFO][1] namespace_controller.go 223: Create/Update Profile in Calico datastore key="kns.kube-public"
2018-07-09 05:46:08.446 [INFO][1] namespace_controller.go 246: Update Profile in Calico datastore with resource version  key="kns.kube-public"
2018-07-09 05:46:08.447 [INFO][1] namespace_controller.go 252: Successfully updated profile key="kns.kube-public"
2018-07-09 05:46:08.447 [INFO][1] namespace_controller.go 223: Create/Update Profile in Calico datastore key="kns.kube-system"
2018-07-09 05:46:08.465 [INFO][1] namespace_controller.go 246: Update Profile in Calico datastore with resource version  key="kns.kube-system"
2018-07-09 05:46:08.476 [INFO][1] namespace_controller.go 252: Successfully updated profile key="kns.kube-system"
4

3 回答 3

1

首先,从 ICP 2.1.0.3 开始,K8S apiserver 的不安全端口 8888 被禁用,所以你不能使用这个不安全的端口与 Kubenetes 通信。

对于这个问题,您能否让我知道以下信息或输出。

  • 您环境中的网络配置。-> ifconfig -a
  • 路由表:-> 路由
  • /ect/hosts 文件的内容:-> cat /etc/hosts
  • ICP 安装配置文件。-> config.yaml 和主机
于 2018-07-09T03:09:40.217 回答
0

我有同样的经历,至少同样的高级错误信息:

“失败 - 重试:等待 kube-dns 启动”。

我必须做两件事才能通过这个安装步骤:

  • 将主机名(以及我的 /etc/hosts 中的条目)更改为小写。印花布不喜欢大写
  • 注释 /etc/hosts 中的 localhost 条目(#127.0.0.1 localhost.localdomain localhost)

完成后,安装完成正常。

于 2018-08-15T09:48:14.413 回答
0

Well issue seemed to be with the Docker storage driver (btrfs) that I was using. Once I switched to 'Overlay', I was able to move forward.

于 2018-08-13T18:26:58.867 回答