A while ago I created a Ceph cluster with Rook on a single-node k3s cluster, just to try it out, and it worked very well: I was able to provide storage to other pods through CephFS. I followed the examples given in the Rook quickstart documentation to set it up.
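For context, the setup was roughly the following, straight from the quickstart (a sketch from memory; the exact manifest names may differ depending on the Rook version, which in my case pulls ceph/ceph:v14.2.7, so presumably Rook 1.2.x):

git clone --single-branch --branch release-1.2 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
k3s kubectl apply -f common.yaml
k3s kubectl apply -f operator.yaml
k3s kubectl apply -f cluster-test.yaml            # single-node test cluster
k3s kubectl apply -f filesystem-test.yaml         # the CephFS filesystem
k3s kubectl apply -f csi/cephfs/storageclass.yaml # StorageClass used by the other pods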
However, two days ago, without any intervention on my part, the Ceph cluster stopped working. The Ceph manager pod seems to be the problem: my pod rook-ceph-mgr-a-6447569f69-5prdw is crash-looping. Here are its events:
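The events below were pulled from the pod description with something like:

k3s kubectl -n rook-ceph describe pod rook-ceph-mgr-a-6447569f69-5prdw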
Events:
Type     Reason       Age                    From                Message
----     ------       ----                   ----                -------
Warning  BackOff      41m (x888 over 6h5m)   kubelet, localhost  Back-off restarting failed container
Warning  Unhealthy    36m (x234 over 6h14m)  kubelet, localhost  Liveness probe failed: Get http://10.42.0.163:9283/: dial tcp 10.42.0.163:9283: connect: connection refused
Warning  FailedMount  31m (x2 over 31m)      kubelet, localhost  MountVolume.SetUp failed for volume "rook-ceph-mgr-a-keyring" : failed to sync secret cache: timed out waiting for the condition
Warning  FailedMount  31m (x2 over 31m)      kubelet, localhost  MountVolume.SetUp failed for volume "rook-ceph-mgr-token-bf88n" : failed to sync secret cache: timed out waiting for the condition
Warning  FailedMount  31m (x2 over 31m)      kubelet, localhost  MountVolume.SetUp failed for volume "rook-config-override" : failed to sync configmap cache: timed out waiting for the condition
Normal   Killing      28m (x2 over 30m)      kubelet, localhost  Container mgr failed liveness probe, will be restarted
Normal   Pulled       28m (x3 over 31m)      kubelet, localhost  Container image "ceph/ceph:v14.2.7" already present on machine
Normal   Created      28m (x3 over 31m)      kubelet, localhost  Created container mgr
Normal   Started      28m (x3 over 31m)      kubelet, localhost  Started container mgr
Warning  BackOff      6m47s (x50 over 22m)   kubelet, localhost  Back-off restarting failed container
Warning  Unhealthy    63s (x28 over 30m)     kubelet, localhost  Liveness probe failed: Get http://10.42.0.163:9283/: dial tcp 10.42.0.163:9283: connect: connection refused
I don't know whether the failed to sync secret cache error is the cause or just a consequence. Is this a Rook problem or a k3s problem?
k3s kubectl logs rook-ceph-mgr-a-6447569f69-5prdw -n rook-ceph produces no output (adding -p does not change anything).
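I am not sure what else to check; the next things I plan to try are roughly these (just a sketch):

k3s kubectl -n rook-ceph get pods                            # overall state of the Rook/Ceph pods
k3s kubectl -n rook-ceph logs deploy/rook-ceph-operator      # whether the operator reports anything about mgr-a
k3s kubectl -n rook-ceph get secret rook-ceph-mgr-a-keyring  # whether the secret behind the failing volume mount still exists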
Thanks for your help. This is my first question on Stack Overflow, I hope I did it right :)