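The CUDA Runtime has a notion of a "current device", while the CUDA Driver does not. Instead, the driver has a stack of contexts, with the "current context" at the top of the stack.
How do the two interact? That is, how do Driver API calls affect the Runtime API's current device, and how does changing the current device affect the Driver API's context stack or other state?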
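If you set the current device (with cudaSetDevice()), the primary context of the chosen device is placed at the top of the stack.
(I'm not 100% sure about this part, so take it with a grain of salt.)
The Runtime reports the current device as the device of the current context, whether or not that context is a primary context.
If the context stack is empty, the Runtime reports the current device as 0.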
The following program illustrates this behavior:
#include <cuda/api.hpp>
#include <iostream>

// Print the device index the Runtime currently reports.
void report_current_device()
{
    std::cout << "Runtime believes the current device index is: "
        << cuda::device::current::detail_::get_id() << '\n';
}

int main()
{
    namespace context = cuda::context::detail_;
    namespace cur_dev = cuda::device::current::detail_;
    namespace pc = cuda::device::primary_context::detail_;
    namespace cur_ctx = cuda::context::current::detail_;
    using std::cout;

    cuda::device::id_t dev_idx[2];
    cuda::context::handle_t pc_handle[2];

    cuda::initialize_driver();
    // The context stack is still empty at this point.
    dev_idx[0] = cur_dev::get_id();
    report_current_device();
    dev_idx[1] = (dev_idx[0] == 0) ? 1 : 0;

    // Obtaining the primary contexts does not push them onto the stack.
    pc_handle[0] = pc::obtain_and_increase_refcount(dev_idx[0]);
    cout << "Obtained primary context handle for device " << dev_idx[0] << '\n';
    pc_handle[1] = pc::obtain_and_increase_refcount(dev_idx[1]);
    cout << "Obtained primary context handle for device " << dev_idx[1] << '\n';
    report_current_device();

    cur_ctx::push(pc_handle[1]);
    cout << "Pushed primary context handle for device " << dev_idx[1] << " onto the stack\n";
    report_current_device();

    // A new, non-primary context for the first device.
    auto ctx = context::create_and_push(dev_idx[0]);
    cout << "Created a new context for device " << dev_idx[0] << " and pushed it onto the stack\n";
    report_current_device();

    // Note: ctx is the context created just above, although the message calls it "primary".
    cur_ctx::push(ctx);
    cout << "Pushed primary context handle for device " << dev_idx[0] << " onto the stack\n";
    report_current_device();

    cur_ctx::push(pc_handle[1]);
    cout << "Pushed primary context for device " << dev_idx[1] << " onto the stack\n";
    report_current_device();

    // Release the primary context while it is still at the top of the stack.
    pc::decrease_refcount(dev_idx[1]);
    cout << "Deactivated/destroyed primary context for device " << dev_idx[1] << '\n';
    report_current_device();
}
... which results in:
Runtime believes the current device index is: 0
Obtained primary context handle for device 0
Obtained primary context handle for device 1
Runtime believes the current device index is: 0
Pushed primary context handle for device 1 onto the stack
Runtime believes the current device index is: 1
Created a new context for device 0 and pushed it onto the stack
Runtime believes the current device index is: 0
Pushed primary context handle for device 0 onto the stack
Runtime believes the current device index is: 0
Pushed primary context for device 1 onto the stack
Runtime believes the current device index is: 1
Deactivated/destroyed primary context for device 1
Runtime believes the current device index is: 1
The program uses this library of mine.