linux - CUDA/PyCUDA: Which GPU is running X11?

Question

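In a Linux system with multiple GPUs, how can you determine which GPU is running X11 and which is completely free to run CUDA kernels? In a system with a low-powered GPU running X11 and a higher-powered GPU running kernels, a simple heuristic (pick the faster card) settles this. On a system with two identical cards, that heuristic cannot be used. Is there a CUDA and/or X11 API to determine this?

UPDATE: The command "nvidia-smi -a" shows whether a "display" is connected. I have not yet determined whether this means physically connected, logically connected (running X11), or both. Running strace on this command shows many ioctl calls and no calls to X11, so I assume the card is reporting that a display is physically connected.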

score 2 · Accepted Answer

The cudaDeviceProp structure has a device property, kernelExecTimeoutEnabled, which indicates whether the device is subject to a display watchdog timer. That is the best indicator of whether a given CUDA device is running X11 (or the Windows/Mac OS equivalent).

In PyCUDA you can query the device status like this:

In [1]: from pycuda import driver as drv

In [2]: drv.init()

In [3]: print(drv.Device(0).get_attribute(drv.device_attribute.KERNEL_EXEC_TIMEOUT))
1

In [4]: print(drv.Device(1).get_attribute(drv.device_attribute.KERNEL_EXEC_TIMEOUT))
0

Here device 0 has a display attached, and device 1 is a dedicated compute device.
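The same query can be wrapped so that it covers every device in the system. The following is only a minimal sketch: the helper names split_by_watchdog and query_watchdog_flags are made up for illustration, and it assumes PyCUDA is installed and at least one CUDA device is visible.

```python
def split_by_watchdog(timeout_flags):
    """Split device indices by their KERNEL_EXEC_TIMEOUT flag.

    timeout_flags maps device index -> attribute value (nonzero means the
    display watchdog is active on that device).
    Returns (display_devices, compute_only_devices), both sorted.
    """
    display = [i for i, flag in sorted(timeout_flags.items()) if flag]
    compute = [i for i, flag in sorted(timeout_flags.items()) if not flag]
    return display, compute


def query_watchdog_flags():
    """Read KERNEL_EXEC_TIMEOUT for every CUDA device via PyCUDA."""
    # Imported lazily so split_by_watchdog stays usable without a GPU.
    from pycuda import driver as drv

    drv.init()
    return {
        i: drv.Device(i).get_attribute(drv.device_attribute.KERNEL_EXEC_TIMEOUT)
        for i in range(drv.Device.count())
    }
```

Calling split_by_watchdog(query_watchdog_flags()) on the two-GPU machine from the session above would put device 0 in the first list and device 1 in the second.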

score 0 · Accepted Answer

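I don't know of any library function that can check this. However, one "hack" comes to mind: X11, or any other system component that manages a connected monitor, must consume some GPU memory.

So, check whether both devices report the same amount of global memory via cudaGetDeviceProperties by comparing the value of the totalGlobalMem field. If it is the same, try allocating that amount (or only slightly less) on each GPU and see which one fails to do so (cudaMalloc returns an error flag).

Some time ago I read somewhere (I don't remember where) that if you increase your monitor resolution while there is an active CUDA context on the GPU, the context may be invalidated. That hints that the suggestion above might work. Note, however, that I have never actually tried it. It is just a wild guess.

If you manage to confirm that it works, or that it doesn't, let us know!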
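A rough PyCUDA sketch of this idea follows. It is a variant, not the exact procedure described: instead of attempting a near-total allocation with cudaMalloc, it compares the free memory each device reports through mem_get_info. The helper names are hypothetical, and like the suggestion itself this is untested guesswork; any other process holding GPU memory would skew the result.

```python
def likely_display_device(free_bytes):
    """Guess which device hosts the display from free-memory readings.

    free_bytes maps device index -> free global memory in bytes.
    Returns the index with the least free memory, or None when all
    readings are equal (or the mapping is empty), i.e. no hint.
    """
    if not free_bytes:
        return None
    lowest = min(free_bytes.values())
    if all(value == lowest for value in free_bytes.values()):
        return None
    # On ties for the lowest reading, report the smallest device index.
    return min(dev for dev, value in free_bytes.items() if value == lowest)


def query_free_memory():
    """Collect free global memory per device using PyCUDA."""
    # Imported lazily so likely_display_device works without a GPU.
    from pycuda import driver as drv

    drv.init()
    free = {}
    for i in range(drv.Device.count()):
        ctx = drv.Device(i).make_context()  # mem_get_info needs a current context
        try:
            free_now, _total = drv.mem_get_info()
            free[i] = free_now
        finally:
            ctx.pop()  # leave no context active on this device
    return free
```

On two identical cards, likely_display_device(query_free_memory()) would return the index of the card with less free memory, which under the assumption above is the one driving the display.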

