
I have a Kubernetes cluster with 1 master node and 5 worker nodes. I am setting up EFK following this reference: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes#step-4-%E2%80%94-creating-the-fluentd-daemonset

While creating the Fluentd DaemonSet, 1 out of the 5 fluentd Pods is stuck in the ImagePullBackOff state:

kubectl get all -n kube-logging -o wide

NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE   CONTAINERS   IMAGES                                                                SELECTOR
ds/fluentd   5         5         4       5            4           <none>          1d    fluentd      fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1   app=fluentd

NAME               READY     STATUS             RESTARTS   AGE       IP              NODE
po/fluentd-82h6k   1/1       Running            1          1d        100.96.15.56    ip-172-20-52-52.us-west-1.compute.internal
po/fluentd-8ghjq   0/1       ImagePullBackOff   0          17h       100.96.10.170   ip-172-20-58-72.us-west-1.compute.internal
po/fluentd-fdmc8   1/1       Running            1          1d        100.96.3.73     ip-172-20-63-147.us-west-1.compute.internal
po/fluentd-g7755   1/1       Running            1          1d        100.96.2.22     ip-172-20-60-101.us-west-1.compute.internal
po/fluentd-gj8q8   1/1       Running            1          1d        100.96.16.17    ip-172-20-57-232.us-west-1.compute.internal

admin@ip-172-20-58-79:~$ kubectl describe po/fluentd-8ghjq -n kube-logging

Events:
  Type     Reason      Age                   From                                                 Message
  ----     ------      ----                  ----                                                 -------
  Normal   BackOff     12m (x4364 over 17h)  kubelet, ip-172-20-58-72.us-west-1.compute.internal  Back-off pulling image "fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1"
  Warning  FailedSync  2m (x4612 over 17h)   kubelet, ip-172-20-58-72.us-west-1.compute.internal  Error syncing pod

Kubelet logs from the node where the Fluentd Pod fails to run:

admin@ip-172-20-58-72:~$ journalctl -u kubelet -f

Apr 21 03:53:53 ip-172-20-58-72 kubelet[755]: E0421 03:53:53.095334     755 summary.go:92] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
Apr 21 03:53:53 ip-172-20-58-72 kubelet[755]: E0421 03:53:53.095369     755 summary.go:92] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"
Apr 21 03:53:53 ip-172-20-58-72 kubelet[755]: W0421 03:53:53.095440     755 helpers.go:847] eviction manager: no observation found for eviction signal allocatableNodeFs.available
Apr 21 03:53:54 ip-172-20-58-72 kubelet[755]: I0421 03:53:54.882213     755 server.go:779] GET /metrics/cadvisor: (50.308555ms) 200 [[Prometheus/2.12.0] 172.20.58.79:54492]
Apr 21 03:53:55 ip-172-20-58-72 kubelet[755]: I0421 03:53:55.452951     755 kuberuntime_manager.go:500] Container {Name:fluentd Image:fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1 Command:[] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[{Name:FLUENT_ELASTICSEARCH_HOST Value:vpc-cog-01-es-dtpgkfi.ap-southeast-1.es.amazonaws.com ValueFrom:nil} {Name:FLUENT_ELASTICSEARCH_PORT Value:443 ValueFrom:nil} {Name:FLUENT_ELASTICSEARCH_SCHEME Value:https ValueFrom:nil} {Name:FLUENTD_SYSTEMD_CONF Value:disable ValueFrom:nil}] Resources:{Limits:map[memory:{i:{value:536870912 scale:0} d:{Dec:<nil>} s: Format:BinarySI}] Requests:map[cpu:{i:{value:100 scale:-3} d:{Dec:<nil>} s:100m Format:DecimalSI} memory:{i:{value:209715200 scale:0} d:{Dec:<nil>} s: Format:BinarySI}]} VolumeMounts:[{Name:varlog ReadOnly:false MountPath:/var/log SubPath: MountPropagation:<nil>} {Name:varlibdockercontainers ReadOnly:true MountPath:/var/lib/docker/containers SubPath: MountPropagation:<nil>} {Name:fluentd-token-k8fnp ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath: MountPropagation:<nil>}] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Apr 21 03:53:55 ip-172-20-58-72 kubelet[755]: E0421 03:53:55.455327     755 pod_workers.go:182] Error syncing pod aa65dd30-82f2-11ea-a005-0607d7cb72ed ("fluentd-8ghjq_kube-logging(aa65dd30-82f2-11ea-a005-0607d7cb72ed)"), skipping: failed to "StartContainer" for "fluentd" with ImagePullBackOff: "Back-off pulling image \"fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1\""

Kubelet logs from a node where Fluentd runs successfully:

admin@ip-172-20-63-147:~$ journalctl -u kubelet -f

Apr 21 04:09:25 ip-172-20-63-147 kubelet[1272]: E0421 04:09:25.874293    1272 summary.go:92] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"
Apr 21 04:09:25 ip-172-20-63-147 kubelet[1272]: E0421 04:09:25.874336    1272 summary.go:92] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
Apr 21 04:09:25 ip-172-20-63-147 kubelet[1272]: W0421 04:09:25.874453    1272 helpers.go:847] eviction manager: no observation found for eviction signal allocatableNodeFs.available
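Since the kubelet logs on the failing and healthy nodes look nearly identical, the cause is probably node-local (registry reachability, DNS, or disk space) rather than a manifest error. Below is a minimal troubleshooting sketch to run on the failing node; the image tag is the one from the Pod events, everything else is a generic check, not a confirmed fix:

```shell
# Image tag copied from the Pod events; adjust if yours differs.
IMAGE="fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1"

# 1. Can this node resolve the registry hostname at all?
getent hosts registry-1.docker.io || echo "DNS resolution failed"

# 2. Is there disk space left for image layers? A full disk is a
#    common reason for a single node failing to pull.
df -h /var/lib/docker 2>/dev/null || df -h /

# 3. Try the pull by hand: the direct error message (timeout, auth,
#    no space left) is far more specific than kubelet's back-off event.
if command -v docker >/dev/null 2>&1; then
  docker pull "$IMAGE"
fi
```

If the manual pull succeeds, deleting the stuck Pod (`kubectl delete po/fluentd-8ghjq -n kube-logging`) makes the DaemonSet recreate it, so kubelet retries immediately instead of waiting out the back-off.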
