theano - theanorc for multiple GPUs

Question

I have an AWS machine with 4 GPUs:

00:03.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K520] (rev a1)
00:04.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K520] (rev a1)
00:05.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K520] (rev a1)
00:06.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K520] (rev a1)

My theanorc file looks like this:

[global]
floatX = float32
device = gpu0

[lib]
cnmem = 1

When I open a Jupyter notebook and import theano, I get the following (I assume it is using only one GPU):

Using Theano backend.
Using gpu device 0: GRID K520 (CNMeM is enabled with initial size: 95.0% of memory, cuDNN 5105)
/home/sabeywardana/anaconda3/lib/python3.5/site-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.

However, if I open a second Jupyter notebook on the same machine at the same time, I get this error:

ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device 0 failed:
initCnmem: cnmemInit call failed! Reason=CNMEM_STATUS_OUT_OF_MEMORY. numdev=1

ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device gpu failed:
initCnmem: cnmemInit call failed! Reason=CNMEM_STATUS_OUT_OF_MEMORY. numdev=1

If I manually change my .theanorc to use gpu1, the second Jupyter notebook works fine. So the question is: is there a way to configure .theanorc so that it picks up an available GPU?

score 2 · Accepted Answer

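You can use device=gpu, which selects the first available GPU. In your case, however, GPU 0 will still count as "available": it has little memory left, but it can still run work. You can use nvidia-smi to set the GPUs' compute mode to "Exclusive Thread", so that the first notebook claims the first GPU for exclusive use and the second notebook uses another one.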

Another option is to change the THEANO_FLAGS environment variable from inside the notebook before importing theano. Something like:

import os

os.environ['THEANO_FLAGS'] = os.environ.get('THEANO_FLAGS', '') + ',' + 'device=gpu1'

import theano
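
To avoid hard-coding gpu1, the same idea can be combined with a query to nvidia-smi for the GPU with the least memory in use. The following is only a sketch: the helper names least_used_gpu and select_free_gpu are made up for illustration, and it assumes that nvidia-smi is on the PATH and that its GPU indices match Theano's gpuN numbering, which may not hold on every machine.

```python
import os
import subprocess


def least_used_gpu(smi_csv):
    """Return the index of the GPU with the least memory in use.

    `smi_csv` is text with one "index, memory_used_MiB" pair per line.
    """
    rows = []
    for line in smi_csv.strip().splitlines():
        index, used = (field.strip() for field in line.split(","))
        rows.append((int(used), int(index)))
    # min() compares memory first, then index, so ties go to the lower index.
    return min(rows)[1]


def select_free_gpu():
    """Hypothetical helper: append device=gpuN for the least-used GPU."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    device = "device=gpu%d" % least_used_gpu(out)
    existing = os.environ.get("THEANO_FLAGS", "")
    # Skip the empty string so the flags never start with a stray comma.
    os.environ["THEANO_FLAGS"] = ",".join(f for f in (existing, device) if f)
    return device


# In a notebook, call select_free_gpu() before the first `import theano`.
```

This is inherently racy: two notebooks started at the same moment can still pick the same GPU, so the exclusive compute mode mentioned above remains the more robust guard.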

score 1 · Accepted Answer

The GPU device cannot be changed after theano has been imported.

Maybe you can try this:

import os
os.system("THEANO_FLAGS='device=gpu0' python script_1.py")
os.system("THEANO_FLAGS='device=gpu1' python script_2.py")
os.system("THEANO_FLAGS='device=gpu1' python script_3.py")
os.system("THEANO_FLAGS='device=gpu1' python script_4.py")
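
Note that os.system waits for each command to finish, so the calls above run the scripts one after another. If the goal is to run them at the same time on different GPUs, a rough sketch with subprocess.Popen could look like the following. The helper names theano_env and launch_all are hypothetical, and the script-to-device mapping in the usage comment is only an example.

```python
import os
import subprocess
import sys


def theano_env(device, base_env=None):
    """Return a copy of the environment with THEANO_FLAGS set to `device`.

    Any THEANO_FLAGS value already present in the environment is overwritten.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["THEANO_FLAGS"] = "device=%s" % device
    return env


def launch_all(jobs):
    """Start every (script, device) pair without waiting, then wait for all.

    Returns the exit codes in the same order as `jobs`.
    """
    procs = [
        subprocess.Popen([sys.executable, script], env=theano_env(device))
        for script, device in jobs
    ]
    return [p.wait() for p in procs]


# Example usage (illustrative mapping, one script per GPU):
# launch_all([("script_1.py", "gpu0"), ("script_2.py", "gpu1")])
```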

If you want to do this from inside the notebook (more programmatically), you can use the following snippet:

import theano.sandbox.cuda
theano.sandbox.cuda.use("gpu0")

Paste this into each notebook and change the GPU id. It will work.
