python - Linear interpolation (lerp) with pycuda

Question

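I'm a casual Python user who is just getting into pyCUDA. I'm trying to figure out how to implement linear interpolation (lerp) using pyCUDA. The CUDA CG function is: http://http.developer.nvidia.com/Cg/lerp.html

My end goal is to do bilinear interpolation in pycuda from a set of weighted random points. I've never written C or CUDA code before, and I'm learning as I go.

This is how far I've gotten: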

import pycuda.autoinit
import pycuda.driver as drv
import pycuda.compiler as comp

lerpFunction = """__global__ float lerp(float a, float b, float w)
{
    return a + w*(b-a);
}"""

mod = comp.SourceModule(lerpFunction) # This returns an error telling me a global must return a void. :(

Any help with this would be great!

score 1 · Accepted Answer

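The error message is pretty explicit: CUDA kernels cannot return values. They must be declared void, and any results are written through arguments passed as pointers. It would make more sense to declare your lerp implementation as a device function, like this: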

__device__ float lerp(float a, float b, float w)
{
    return a + w*(b-a);
}

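and then call it from inside a kernel for each value that needs interpolating. On its own, your lerp function lacks a lot of the "infrastructure" (pointer arguments for input and output, and thread indexing) needed to be a useful CUDA kernel.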

EDIT: A really basic kernel might look something like this:

__global__ void lerp_kernel(const float *a, const float *b, const float w, float *y)
{
    int tid = threadIdx.x + blockIdx.x*blockDim.x; // unique thread number in the grid
    y[tid] = a[tid] + w*(b[tid]-a[tid]);
}
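
For the Python side, here is a minimal sketch of how the device function and kernel above could be compiled and launched from pycuda. It is an illustration under some assumptions, not the only way to do it: the extra `n` parameter and the `tid < n` bounds check are additions (the basic kernel above assumes the grid has exactly one thread per element), and the names `KERNEL_SOURCE`, `lerp_reference`, `launch_config`, and `lerp_gpu` are made up for this example. The 256 threads per block is an arbitrary choice.

```python
import numpy as np

# CUDA C source: the __device__ helper from the answer, called from a kernel.
# The extra `n` parameter and the bounds check are additions for this sketch.
KERNEL_SOURCE = """
__device__ float lerp(float a, float b, float w)
{
    return a + w*(b-a);
}

__global__ void lerp_kernel(const float *a, const float *b, const float w,
                            float *y, const int n)
{
    int tid = threadIdx.x + blockIdx.x*blockDim.x; // unique thread number in the grid
    if (tid < n)   // threads past the end of the arrays do nothing
        y[tid] = lerp(a[tid], b[tid], w);
}
"""


def lerp_reference(a, b, w):
    """CPU version of the same formula, for checking the GPU result."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return a + np.float32(w) * (b - a)


def launch_config(n, threads_per_block=256):
    """Return (block, grid) tuples with enough threads to cover n elements."""
    blocks = (n + threads_per_block - 1) // threads_per_block  # round up
    return (threads_per_block, 1, 1), (blocks, 1)


def lerp_gpu(a, b, w):
    """Run lerp_kernel on the GPU; needs pycuda and a CUDA device."""
    # Imported here so the rest of the file loads on a machine without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    # The kernel takes float pointers, so the host arrays must be float32.
    a = np.ascontiguousarray(a, dtype=np.float32)
    b = np.ascontiguousarray(b, dtype=np.float32)
    y = np.empty_like(a)

    block, grid = launch_config(a.size)
    kernel = SourceModule(KERNEL_SOURCE).get_function("lerp_kernel")
    # drv.In / drv.Out copy the arrays to and from the device around the call.
    # Scalars must be passed as sized numpy types, not plain Python numbers.
    kernel(drv.In(a), drv.In(b), np.float32(w), drv.Out(y), np.int32(a.size),
           block=block, grid=grid)
    return y
```

On a machine with a CUDA device, `lerp_gpu(a, b, 0.5)` should agree with `lerp_reference(a, b, 0.5)` to within float32 rounding (for example via `np.allclose`). The CPU reference is also a convenient way to test the indexing logic before moving on to the bilinear case.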
