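Previously, I had an FSx volume mounted on the /shared directory. However, Ubuntu 18.04 + FSx has a bug that causes the FSx volume to be unmounted whenever the instance reboots.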
Temporary workaround:
Mount the FSx volume again:
wget -O - https://fsx-lustre-client-repo-public-keys.s3.amazonaws.com/fsx-ubuntu-public-key.asc | sudo apt-key add -
sudo bash -c 'echo "deb https://fsx-lustre-client-repo.s3.amazonaws.com/ubuntu bionic main" > /etc/apt/sources.list.d/fsxlustreclientrepo.list'
sudo apt update -y
sudo apt install -y lustre-client-modules-$(uname -r)
sudo mount -t lustre -o noatime,flock fs-<id of the fsx>.fsx.us-east-1.amazonaws.com@tcp:/fsx /shared
ubuntu@<>:~$ ls /shared/
DeepLearningExamples checkpoint checkpoints checkpoints-1.data-00000-of-00001 checkpoints-1.index conda_tf25 conda_tf25_hvd deep-learning-models nccl_hosts
A cleaner solution, however, would not require remounting the FSx volume after every instance reboot.
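One possible direction, offered only as a sketch and not verified against the Ubuntu 18.04 bug described above, is to let the system mount the volume at boot through an /etc/fstab entry. The helper below is hypothetical (the name fsx_fstab_line is not from the original answer); it only prints the entry and changes nothing on the system. The noatime and flock options are taken from the manual mount command above, while _netdev is an assumption added so the mount waits for networking.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): prints an /etc/fstab
# entry for an FSx for Lustre file system. It only prints; nothing is modified.
fsx_fstab_line() {
    dns_name="$1"     # e.g. fs-<id of the fsx>.fsx.us-east-1.amazonaws.com
    mount_name="$2"   # "fsx" in the mount command above
    mount_point="$3"  # "/shared" in the mount command above
    # noatime,flock mirror the manual mount command; _netdev is an assumption
    # that marks the filesystem as requiring the network to be up.
    printf '%s@tcp:/%s %s lustre defaults,noatime,flock,_netdev 0 0\n' \
        "$dns_name" "$mount_name" "$mount_point"
}

# Preview the entry for a placeholder DNS name (replace with the real one).
fsx_fstab_line "fs-EXAMPLE.fsx.us-east-1.amazonaws.com" fsx /shared
```

To try it, the printed line could be appended with sudo tee -a /etc/fstab and checked with sudo mount -a before rebooting. This can only work if the Lustre client module for the running kernel is still installed after the reboot.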
Answered 2021-06-08T18:58:54.430