
I am running a Dask cluster and a Jupyter notebook server on cloud resources, using Kubernetes and Helm.

I am using a YAML file for the Dask and Jupyter cluster, originally taken from https://docs.dask.org/en/latest/setup/kubernetes-helm.html:

apiVersion: v1
kind: Pod
worker:
  replicas: 2 #number of workers
  resources:
    limits:
      cpu: 2
      memory: 2G
    requests:
      cpu: 2
      memory: 2G
  env:
    - name: EXTRA_PIP_PACKAGES
      value: s3fs --upgrade
# We want to keep the same packages on the workers and jupyter environments
jupyter:
  enabled: true
  env:
    - name: EXTRA_PIP_PACKAGES
      value: s3fs --upgrade
  resources:
    limits:
      cpu: 1
      memory: 2G
    requests:
      cpu: 1
      memory: 2G

I am using another YAML file to create the storage locally:

# CREATE A PERSISTENT VOLUME CLAIM // attached to our pod config
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dask-cluster-persistent-volume-claim
spec:
  accessModes:
    - ReadWriteOnce # usable by a single node; ReadOnlyMany: read-only on many nodes; ReadWriteMany: read/write on many nodes
  resources:
    requests:
      storage: 2Gi # storage capacity

I would like to add the persistent volume claim to the first YAML file, but I don't know where to add the volumes and volumeMounts sections. If you have any ideas, please share, thank you.


1 Answer


I first create the PVC with a YAML file:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pdask-cluster-persistent-volume-claim
spec:
  accessModes:
    - ReadWriteOnce # usable by a single node; ReadOnlyMany: read-only on many nodes; ReadWriteMany: read/write on many nodes
  resources: # https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes
    requests:
      storage: 2Gi

which I launch in bash:

kubectl apply -f Dask-Persistent-Volume-Claim.yaml
#persistentvolumeclaim/pdask-cluster-persistent-volume-claim created

I check that the persistent volume has been created:

kubectl get pv
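The claim itself can be inspected as well; on most cloud providers a volume is provisioned dynamically and the STATUS column should read Bound (a sketch, the exact output depends on the cluster's storage class):

```shell
# List the claim; STATUS should be "Bound" once a volume is provisioned
kubectl get pvc pdask-cluster-persistent-volume-claim
```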

Then I make major changes to the Dask cluster YAML: I add volumes and volumeMounts entries that point to the previously created persistent volume claim and specify the directory inside the pods (which I set to /save_data) to read from / write to, and I set the serviceType to LoadBalancer together with the service port:

# values.yaml for the Dask Helm chart (release values, not a raw Kubernetes manifest)
scheduler:
  name: scheduler 
  enabled: true
  image:
    repository: "daskdev/dask"
    tag: 2021.8.1
    pullPolicy: IfNotPresent
  replicas: 1  #(should always be 1).
  serviceType: "LoadBalancer" # Scheduler service type. Set to `LoadBalancer` to expose outside of your cluster.
  # serviceType: "NodePort"
  # serviceType: "ClusterIP"
  #loadBalancerIP: null  # Some cloud providers allow you to specify the loadBalancerIP when using the `LoadBalancer` service type. If your cloud does not support it this option will be ignored.
  servicePort: 8786 # Scheduler service internal port.
# DASK WORKERS
worker:
  name: worker  # Dask worker name.
  image:
    repository: "daskdev/dask"  # Container image repository.
    tag: 2021.8.1  # Container image tag.
    pullPolicy: IfNotPresent  # Container image pull policy.
    dask_worker: "dask-worker"  # Dask worker command. E.g `dask-cuda-worker` for GPU worker.
  replicas: 2
  resources:
    limits:
      cpu: 2
      memory: 2G
    requests:
      cpu: 2
      memory: 2G
  mounts: # Worker pod volumes and volume mounts; mounts.volumes follows the Kubernetes v1 Volume spec, mounts.volumeMounts follows the Kubernetes v1 VolumeMount spec
    volumes:
      - name: dask-storage
        persistentVolumeClaim:
          claimName: pdask-cluster-persistent-volume-claim # must match the PVC's metadata.name
    volumeMounts:
      - name: dask-storage
        mountPath: /save_data # folder for storage
  env:
    - name: EXTRA_PIP_PACKAGES
      value: s3fs --upgrade
# We want to keep the same packages on the worker and jupyter environments
jupyter:
  name: jupyter  # Jupyter name.
  enabled: true  # Enable/disable the bundled Jupyter notebook.
  #rbac: true  # Create RBAC service account and role to allow Jupyter pod to scale worker pods and access logs.
  image:
    repository: "daskdev/dask-notebook"  # Container image repository.
    tag: 2021.8.1  # Container image tag.
    pullPolicy: IfNotPresent  # Container image pull policy.
  replicas: 1  # Number of notebook servers.
  serviceType: "LoadBalancer" # Scheduler service type. Set to `LoadBalancer` to expose outside of your cluster.
  # serviceType: "NodePort"
  # serviceType: "ClusterIP"
  servicePort: 80  # Jupyter service internal port.
  # This hash corresponds to the password 'dask'
  #password: 'sha1:aae8550c0a44:9507d45e087d5ee481a5ce9f4f16f37a0867318c' # Password hash.
  env:
    - name: EXTRA_PIP_PACKAGES
      value: s3fs --upgrade
  resources:
    limits:
      cpu: 1
      memory: 2G
    requests:
      cpu: 1
      memory: 2G
  mounts: # Jupyter pod volumes and volume mounts; mounts.volumes follows the Kubernetes v1 Volume spec, mounts.volumeMounts follows the Kubernetes v1 VolumeMount spec
    volumes:
      - name: dask-storage
        persistentVolumeClaim:
          claimName: pdask-cluster-persistent-volume-claim # must match the PVC's metadata.name
    volumeMounts:
      - name: dask-storage
        mountPath: /save_data # folder for storage

Then I install my Dask configuration with helm:

helm install my-config dask/dask -f values.yaml
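If the `dask` chart repository has not been registered yet, the install command above will fail to resolve `dask/dask`; the repository published by the Dask project needs to be added first:

```shell
# Register the Dask Helm chart repository and refresh the local index
helm repo add dask https://helm.dask.org/
helm repo update
# Then install (or upgrade) the release with the custom values
helm upgrade --install my-config dask/dask -f values.yaml
```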

Finally, I access my jupyter pod interactively:

kubectl exec -ti [pod-name] -- /bin/bash

and check that the /save_data folder exists.
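The point of the shared mount is that anything a worker writes under the mounted path is visible from the Jupyter pod (and vice versa). A minimal stand-in sketch in plain Python, using a temporary directory in place of /save_data (the function and file names are illustrative, not part of the chart):

```python
import tempfile
from pathlib import Path

def write_result(mount_dir: str, name: str, payload: str) -> Path:
    """Simulate a worker writing a result file under the shared mount."""
    target = Path(mount_dir) / name
    target.write_text(payload)
    return target

def read_result(mount_dir: str, name: str) -> str:
    """Simulate the Jupyter pod reading the same file back."""
    return (Path(mount_dir) / name).read_text()

# A temp dir stands in for the /save_data mount point
with tempfile.TemporaryDirectory() as mount:
    write_result(mount, "result.txt", "computed on a worker")
    print(read_result(mount, "result.txt"))  # prints "computed on a worker"
```

In the real cluster, a function submitted to the workers would open a path under /save_data the same way, and the notebook would read it back from its own /save_data mount.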

Answered 2021-08-28T08:50:30.293