I am trying to replicate some work/experiments that require me to follow this particular tutorial on setting up Jupyter + Tensorflow + Nvidia GPU + Docker + Google Compute Engine.
I was able to install nvidia-docker successfully. However, in the tutorial section "Verify the GPU is Visible from a Docker Container", when I try to run
sudo nvidia-docker-plugin
I get the following error (see the last line):
nvidia-docker-plugin | 2019/04/23 15:17:47 Loading NVIDIA unified memory
nvidia-docker-plugin | 2019/04/23 15:17:47 Loading NVIDIA management library
nvidia-docker-plugin | 2019/04/23 15:17:47 Discovering GPU devices
nvidia-docker-plugin | 2019/04/23 15:17:47 Provisioning volumes at /var/lib/nvidia-docker/volumes
nvidia-docker-plugin | 2019/04/23 15:17:47 Serving plugin API at /run/docker/plugins
nvidia-docker-plugin | 2019/04/23 15:17:47 Serving remote API at localhost:3476
nvidia-docker-plugin | 2019/04/23 15:17:47 Error: listen tcp 127.0.0.1:3476: bind: address already in use
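For what it's worth, the last line suggests that something is already bound to 127.0.0.1:3476 on the instance. A quick way to confirm that is a small bind test like the one below. This is only a minimal sketch: the helper name `port_in_use` is my own, and it only shows whether the port can be bound right now, not which process is holding it.

```python
import errno
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if binding host:port fails with 'address already in use'."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            # Same kind of bind the plugin attempts for its remote API.
            s.bind((host, port))
        except OSError as e:
            return e.errno == errno.EADDRINUSE
        return False


if __name__ == "__main__":
    # 3476 is the port reported in the log above.
    print("port 3476 in use:", port_in_use(3476))
```

If this prints True, another process (possibly an nvidia-docker-plugin instance that is already running) would be occupying the port, which would explain why starting the plugin by hand fails.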
When I run
sudo nvidia-docker run --rm nvidia/cuda nvidia-smi
I get the following "executable file not found in $PATH" error:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH": unknown.
ERRO[0000] error waiting for container: context canceled
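I am very new to docker, so it would be great if someone could walk me through the solution. I have tried looking for answers, but the actual process of solving the problem eludes me. Any help would be greatly appreciated.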
EDIT: I set up the GCE instance as described in the tutorial (i.e. Ubuntu 16.04 LTS, 50 GB boot disk, 1 GPU, with jupyter and tensorboard).