
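I am trying to run argocd on my EC2 instance (CentOS 7) by following the official documentation and the EKS workshop from AWS, but its pods are stuck in the Pending state. All pods in the kube-system namespace are running fine.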

Below is the output of kubectl get pods --all-namespaces:

NAMESPACE     NAME                                                              READY   STATUS    RESTARTS   AGE
argocd        argocd-application-controller-5785f6b79-nvg7n                     0/1     Pending   0          29s
argocd        argocd-dex-server-7f5d7d6645-gprpd                                0/1     Pending   0          19h
argocd        argocd-redis-cccbb8f7-vb44n                                       0/1     Pending   0          19h
argocd        argocd-repo-server-67ddb49495-pnw5k                               0/1     Pending   0          19h
argocd        argocd-server-6bcbf7997d-jqqrw                                    0/1     Pending   0          19h
kube-system   calico-kube-controllers-56b44cd6d5-tzgdm                          1/1     Running   0          19h
kube-system   calico-node-4z9tx                                                 1/1     Running   0          19h
kube-system   coredns-f9fd979d6-8d6hm                                           1/1     Running   0          19h
kube-system   coredns-f9fd979d6-p9dq6                                           1/1     Running   0          19h
kube-system   etcd-ip-10-1-3-94.us-east-2.compute.internal                      1/1     Running   0          19h
kube-system   kube-apiserver-ip-10-1-3-94.us-east-2.compute.internal            1/1     Running   0          19h
kube-system   kube-controller-manager-ip-10-1-3-94.us-east-2.compute.internal   1/1     Running   0          19h
kube-system   kube-proxy-tkp7k                                                  1/1     Running   0          19h
kube-system   kube-scheduler-ip-10-1-3-94.us-east-2.compute.internal            1/1     Running   0          19h

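The same configuration works fine on my local Mac. I have made sure that the docker and kubernetes services are up and running on the instance. I tried deleting the pods and reconfiguring argocd, but the result stays the same every time.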

Being new to ArgoCD, I am unable to figure out the reason for this. Please let me know where I am going wrong. Thanks!


1 Answer


I figured out the problem by running:

kubectl describe pods <name>  -n argocd

Its output ended with a FailedScheduling event:

...
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  3m (x5 over 7m2s)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
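The event says the only node carries a taint that the argocd pods do not tolerate. Before changing anything, the taints on each node can be listed to confirm this. What follows is a minimal sketch: the function name list_node_taints is made up for illustration, and it assumes kubectl is already configured for the affected cluster.

```shell
# Hypothetical helper: print every node name next to the keys of its taints.
# Assumes kubectl is on PATH and points at the cluster in question.
list_node_taints() {
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Usage: list_node_taints
```

On a single-node cluster like the one in the question, a node-role.kubernetes.io/master entry in the TAINTS column would match the scheduler message above.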

After that, by referring to this GitHub issue, I worked out that I needed to run:

kubectl taint nodes --all node-role.kubernetes.io/master-

After running this command, the pods started working and moved from the Pending state to Running. The kubectl describe pods output then showed:

...
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  3m (x5 over 7m2s)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Normal   Scheduled         106s               default-scheduler  Successfully assigned argocd/argocd-server-7d44dfbcc4-qfj6m to ip-XX-XX-XX-XX.<region>.compute.internal
  Normal   Pulling           105s               kubelet            Pulling image "argoproj/argocd:v1.7.6"
  Normal   Pulled            81s                kubelet            Successfully pulled image "argoproj/argocd:v1.7.6" in 23.779457251s
  Normal   Created           72s                kubelet            Created container argocd-server
  Normal   Started           72s                kubelet            Started container argocd-server
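The taint command in this answer uses --all, so it removes the master taint from every node. On a cluster with more than one node it may be preferable to touch a single node and to be able to put the taint back later. The sketch below is only an illustration: both function names are hypothetical, and the NoSchedule effect used when re-adding the taint is an assumption, since the event message above does not show the effect.

```shell
# Hypothetical helpers; pass the node name as the first argument.

# Remove the master taint from one node only (the trailing "-" means remove).
untaint_master() {
  kubectl taint nodes "$1" node-role.kubernetes.io/master-
}

# Re-add the taint with no value. NoSchedule is an assumed effect;
# check the node's original taint before relying on it.
retaint_master() {
  kubectl taint nodes "$1" node-role.kubernetes.io/master:NoSchedule
}

# Usage: untaint_master ip-XX-XX-XX-XX.<region>.compute.internal
```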

From this error and its resolution, I learned to always use kubectl describe pods when troubleshooting.
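When several pods are Pending at once, describing them one by one gets tedious. As a rough sketch, the scheduling failures for a whole namespace can be pulled from the event list instead. The function name scheduling_failures is hypothetical, and the namespace defaults to argocd only because that is the one used in this question.

```shell
# Hypothetical helper: list FailedScheduling events in one namespace.
# The first argument is the namespace; it defaults to argocd.
scheduling_failures() {
  kubectl get events -n "${1:-argocd}" \
    --field-selector reason=FailedScheduling
}

# Usage: scheduling_failures            (argocd namespace)
#        scheduling_failures my-ns      (any other namespace)
```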

Answered 2020-09-24T08:04:39.790