
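The CUDA Runtime has a notion of a "current device", while the CUDA Driver does not. Instead, the driver has a stack of contexts, where the "current context" is the one at the top of the stack.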

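How do the two interact? That is, how do Driver API calls affect the Runtime API's current device, and how does changing the current device affect the Driver API's context stack or other state?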

Somewhat related question: How can one mix the CUDA Driver API with the CUDA Runtime API?


1 Answer


Runtime current device -> Driver context stack

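If you set the current device (using cudaSetDevice()), the chosen device's primary context is placed at the top of the stack.

  • If the stack is empty, it is pushed onto the stack.
  • If the stack is non-empty, it replaces the top of the stack.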

Driver context stack -> Runtime current device

(I'm not 100% sure about this part, so take it with a grain of salt.)

The Runtime will report the current device as the device of the current context, whether or not that context is a primary context.

If the context stack is empty, the Runtime's current device will be reported as 0.
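The rules in both directions can be condensed into a small toy model. This is only a sketch of the behavior described in this answer, not CUDA code: it makes no real API calls, and `Context`, `ThreadState`, `set_device` and `get_device` are made-up names standing in for what cudaSetDevice() and cudaGetDevice() appear to do. It inherits the same uncertainty as the second half of the answer.

```cpp
#include <cassert>
#include <vector>

// Toy model only: no real CUDA calls are made here.
struct Context {
    int device;       // index of the device this context belongs to
    bool is_primary;  // true for a device's primary context
};

struct ThreadState {
    std::vector<Context> stack;  // stack.back() plays the role of the current context

    // Driver-style push, loosely modelled on cuCtxPushCurrent().
    void push(Context ctx) { stack.push_back(ctx); }

    // Driver-style pop, loosely modelled on cuCtxPopCurrent().
    Context pop() {
        assert(!stack.empty() && "cannot pop an empty context stack");
        Context top = stack.back();
        stack.pop_back();
        return top;
    }

    // Runtime-style "set current device": the device's primary context
    // is pushed if the stack is empty, and replaces the top otherwise.
    void set_device(int device) {
        Context primary{device, true};
        if (stack.empty()) {
            stack.push_back(primary);
        } else {
            stack.back() = primary;
        }
    }

    // Runtime-style "get current device": the device of the current
    // context, primary or not, and 0 when the stack is empty.
    int get_device() const {
        return stack.empty() ? 0 : stack.back().device;
    }
};
```

In this model, calling `set_device(1)` on an empty stack leaves exactly one entry, pushing a non-primary context for device 0 makes `get_device()` return 0, and a further `set_device(1)` overwrites that top entry without growing the stack.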

A program to illustrate this behavior:

#include <cuda/api.hpp>
#include <iostream>

void report_current_device()
{
    std::cout << "Runtime believes the current device index is: "
        << cuda::device::current::detail_::get_id() << '\n';
}

int main()
{
    namespace context = cuda::context::detail_;
    namespace cur_dev = cuda::device::current::detail_;
    namespace pc = cuda::device::primary_context::detail_;
    namespace cur_ctx = cuda::context::current::detail_;
    using std::cout;

    cuda::device::id_t dev_idx[2];
    cuda::context::handle_t pc_handle[2];
    
    cuda::initialize_driver();
    // No context has been pushed by this program yet at this point.
    dev_idx[0] = cur_dev::get_id();
    report_current_device();
    // Pick a second device; this assumes the system has at least two GPUs.
    dev_idx[1] = (dev_idx[0] == 0) ? 1 : 0;
    // Obtain both primary contexts; nothing is pushed onto the stack yet.
    pc_handle[0] = pc::obtain_and_increase_refcount(dev_idx[0]);
    cout << "Obtained primary context handle for device " << dev_idx[0]<< '\n';
    pc_handle[1] = pc::obtain_and_increase_refcount(dev_idx[1]);
    cout << "Obtained primary context handle for device " << dev_idx[1]<< '\n';
    report_current_device();
    // Push a primary context: the reported device follows it.
    cur_ctx::push(pc_handle[1]);
    cout << "Pushed primary context handle for device " << dev_idx[1] << " onto the stack\n";
    report_current_device();
    // Create a new, non-primary context: the reported device follows it as well.
    auto ctx = context::create_and_push(dev_idx[0]);
    cout << "Created a new context for device " << dev_idx[0] << " and pushed it onto the stack\n";
    report_current_device();
    // Note: ctx is the non-primary context created above, although the message below calls it primary.
    cur_ctx::push(ctx);
    cout << "Pushed primary context handle for device " << dev_idx[0] << " onto the stack\n";
    report_current_device();
    cur_ctx::push(pc_handle[1]);
    cout << "Pushed primary context for device " << dev_idx[1] << " onto the stack\n";
    report_current_device();
    // Drop our reference to device 1's primary context while it is still at the top of the stack.
    pc::decrease_refcount(dev_idx[1]);
    cout << "Deactivated/destroyed primary context for device " << dev_idx[1] << '\n';
    report_current_device();
}

...which results in:

Runtime believes the current device index is: 0
Obtained primary context handle for device 0
Obtained primary context handle for device 1
Runtime believes the current device index is: 0
Pushed primary context handle for device 1 onto the stack
Runtime believes the current device index is: 1
Created a new context for device 0 and pushed it onto the stack
Runtime believes the current device index is: 0
Pushed primary context handle for device 0 onto the stack
Runtime believes the current device index is: 0
Pushed primary context for device 1 onto the stack
Runtime believes the current device index is: 1
Deactivated/destroyed primary context for device 1
Runtime believes the current device index is: 1

The program uses this library of mine.

Answered 2021-12-01T19:01:58.833