
I have a Prometheus-server pod that uses an 8Gi persistent block volume. The volume provisioner is rook-ceph.

The pod is in CrashLoopBackOff because there is no free space left on the volume:

[root@node4 ~]# df -h | grep rbd

/dev/rbd0   8.0G  8.0G   36K 100% /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph/mounts/pvc-80f98193-deae-11e9-a240-0025b50a01df

The pod needs more space, so I decided to resize the volume to 20Gi.

Following the documentation: https://kubernetes.io/blog/2018/07/12/resizing-persistent-volumes-using-kubernetes/

I edited resources.requests.storage: 20Gi in the PersistentVolumeClaim and upgraded the Helm release.
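The same edit can also be done with a patch; a sketch, assuming the claim is named prometheus-server in the prometheus namespace (as in the listings below):

```shell
# Resize the claim by patching its storage request.
# Assumes PVC "prometheus-server" in namespace "prometheus".
kubectl -n prometheus patch pvc prometheus-server \
  -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'
```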

Now I can see that the PV has been resized to 20Gi, but the PVC still claims 8Gi.

$ kubectl get pvc -n prometheus

NAME                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
prometheus-alertmanager   Bound    pvc-80f5eb1a-deae-11e9-a240-0025b50a01df   2Gi        RWO            rook-ceph-block   22d
prometheus-server         Bound    pvc-80f98193-deae-11e9-a240-0025b50a01df   8Gi        RWO            rook-ceph-block   22d

$ kubectl get pv

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                STORAGECLASS      REASON   AGE
pvc-80f5eb1a-deae-11e9-a240-0025b50a01df   2Gi        RWO            Delete           Bound    prometheus/prometheus-alertmanager   rook-ceph-block            22d
pvc-80f98193-deae-11e9-a240-0025b50a01df   20Gi       RWO            Delete           Bound    prometheus/prometheus-server         rook-ceph-block            22d
pvc-fb73b383-deb2-11e9-a240-0025b50a01df   10Gi       RWO            Delete           Bound    grafana/grafana                      rook-ceph-block            22d

The PVC description says:

Conditions:
  Type                      Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----                      ------  -----------------                 ------------------                ------  -------
  FileSystemResizePending   True    Mon, 01 Jan 0001 00:00:00 +0000   Thu, 17 Oct 2019 15:49:05 +0530           Waiting for user to (re-)start a pod to finish file system resize of volume on node.

Then I deleted the pod to restart it.
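For reference, restarting here was just deleting the old replica so the Deployment recreates it, something like:

```shell
# Delete the running replica; the Deployment will create a new pod,
# and the pending file system resize should complete on the next mount.
kubectl -n prometheus delete pod prometheus-server-756c8495ff-hcd85
```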

But the pod is still in CrashLoopBackOff. The pod description says:

 Warning  FailedMount  2m17s (x2 over 2m17s)  kubelet, node4     MountVolume.SetUp failed for volume "pvc-80f98193-deae-11e9-a240-0025b50a01df" : mount command failed, status: Failure, reason: Rook: Mount volume failed: failed to attach volume pvc-80f98193-deae-11e9-a240-0025b50a01df for pod prometheus/prometheus-server-756c8495ff-wtx84. Volume is already attached by pod prometheus/prometheus-server-756c8495ff-hcd85. Status Running

When listing the pods I can only see the new prometheus-server-756c8495ff-wtx84 (and not the old pod prometheus-server-756c8495ff-hcd85):

$ kubectl get pods -n prometheus

NAME                                            READY   STATUS             RESTARTS   AGE
prometheus-alertmanager-6f756695d5-wvgr7        2/2     Running            0          22d
prometheus-kube-state-metrics-67cfbbd9d-bwx4w   1/1     Running            0          22d
prometheus-node-exporter-444bz                  1/1     Running            0          22d
prometheus-node-exporter-4hjr9                  1/1     Running            0          22d
prometheus-node-exporter-8plk7                  1/1     Running            0          22d
prometheus-node-exporter-pftf6                  1/1     Running            0          22d
prometheus-node-exporter-prndk                  1/1     Running            0          22d
prometheus-node-exporter-rchtg                  1/1     Running            0          22d
prometheus-node-exporter-xgmzs                  1/1     Running            0          22d
prometheus-pushgateway-77744d999c-5ndlm         1/1     Running            0          22d
prometheus-server-756c8495ff-wtx84              1/2     CrashLoopBackOff   5          4m31s

How can I fix this?

EDIT:

The deployment strategy is:

StrategyType:           RollingUpdate
RollingUpdateStrategy:  1 max unavailable, 1 max surge
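With RollingUpdate and max surge 1, a new pod can be started while the old one still holds the RWO volume, which matches the "Volume is already attached" error above. A possible workaround (not from the original post, just a sketch) is to avoid the overlap, either by scaling to zero before scaling back up, or by switching the strategy to Recreate:

```shell
# Workaround sketch: ensure the old pod releases the RWO volume
# before a new one is scheduled.

# Option 1: scale down, wait for the volume to detach, scale back up.
kubectl -n prometheus scale deployment prometheus-server --replicas=0
# ...wait for the old pod to terminate and the rbd device to unmap...
kubectl -n prometheus scale deployment prometheus-server --replicas=1

# Option 2: switch the strategy so old pods are killed before new ones start.
kubectl -n prometheus patch deployment prometheus-server \
  -p '{"spec":{"strategy":{"type":"Recreate","rollingUpdate":null}}}'
```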

I can see that even though kubectl get pv shows the PV with 20Gi capacity, the actual rbd image in rook-ceph is still only 8Gi in size:

[root@rook-ceph-operator-775cf575c5-dfpql /]# rbd info replicated-metadata-pool/pvc-80f98193-deae-11e9-a240-0025b50a01df

rbd image 'pvc-80f98193-deae-11e9-a240-0025b50a01df':
        size 8 GiB in 2048 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 434b1922b4b40a
        data_pool: ec-data-pool
        block_name_prefix: rbd_data.1.434b1922b4b40a
        format: 2
        features: layering, data-pool
        op_features:
        flags:
        create_timestamp: Tue Sep 24 09:34:28 2019
        access_timestamp: Tue Sep 24 09:34:28 2019
        modify_timestamp: Tue Sep 24 09:34:28 2019
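Since the rbd image itself was never grown, one hedged option (in line with the rook issue linked in the answer below) is to resize the image manually from the operator/toolbox pod; rbd resize interprets a bare --size value as MiB:

```shell
# Manually grow the underlying rbd image to 20 GiB
# (run inside the rook operator/toolbox pod; 20480 MiB = 20 GiB).
rbd resize replicated-metadata-pool/pvc-80f98193-deae-11e9-a240-0025b50a01df --size 20480
```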

The StorageClass is:

$ kubectl get sc -n prometheus -o yaml

apiVersion: v1
items:
- allowVolumeExpansion: true
  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    creationTimestamp: "2019-08-01T11:27:31Z"
    name: rook-ceph-block
    resourceVersion: "15172025"
    selfLink: /apis/storage.k8s.io/v1/storageclasses/rook-ceph-block
    uid: 59e3b081-b44f-11e9-a240-0025b50a01df
  parameters:
    blockPool: replicated-metadata-pool
    clusterNamespace: rook-ceph
    dataBlockPool: ec-data-pool
    fstype: xfs
  provisioner: ceph.rook.io/block
  reclaimPolicy: Delete
  volumeBindingMode: Immediate
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

1 Answer


You can try manually resizing the ext4 filesystem. This is an open issue (https://github.com/rook/rook/issues/3133).
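A sketch of the manual filesystem resize. Note that the StorageClass in the question sets fstype: xfs, so on this node xfs_growfs rather than resize2fs would apply; the mount path and device are taken from the df output in the question:

```shell
# On the node where the volume is mapped, grow the filesystem
# to fill the (already resized) block device.

# For xfs (the StorageClass above sets fstype: xfs):
xfs_growfs /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph/mounts/pvc-80f98193-deae-11e9-a240-0025b50a01df

# For ext4 (as suggested in this answer):
resize2fs /dev/rbd0
```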

answered 2019-10-17T10:36:04.457