I am using pyFFT to Fourier-transform a 2D array and then continue with another OpenCL program (here, doubling the values as an example):
gpu_data = cl_array.to_device(queue, tData2D)
plan.execute(gpu_data.data)
eData2D = gpu_data.get()
ctx = cl.Context([cl.get_platforms()[0].get_devices()[0]])
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
eData2D_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=eData2D)
eData2D_dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, eData2D.nbytes)
prg = cl.Program(ctx, """
    //#define PYOPENCL_DEFINE_CDOUBLE // uncomment for double support.
    #include "pyopencl-complex.h"
    __kernel void sum(const unsigned int ySize,
                      __global cfloat_t *a,
                      __global cfloat_t *b)
    {
        int gid0 = get_global_id(0);
        int gid1 = get_global_id(1);
        b[gid1 + ySize*gid0] = a[gid1 + ySize*gid0]+a[gid1 + ySize*gid0];
    }
    """).build()
prg.sum(queue, eData2D.shape, None, np.int32(Ny), eData2D_buf, eData2D_dest_buf)
cl.enqueue_copy(queue, eData2Dresult, eData2D_dest_buf)
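This works well. Now, rather than retrieving the data to the host with
eData2D = gpu_data.get()
and copying it straight back into GPU memory with
eData2D_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=eData2D)
I would like to keep using the data that is already on the device.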
I was expecting something like this to work:
gpu_data = cl_array.to_device(queue, tData2D)
plan.execute(gpu_data.data)
ctx = cl.Context([cl.get_platforms()[0].get_devices()[0]])
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
eData2D_dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, eData2D.nbytes)
prg = cl.Program(ctx, """
    //#define PYOPENCL_DEFINE_CDOUBLE // uncomment for double support.
    #include "pyopencl-complex.h"
    __kernel void sum(const unsigned int ySize,
                      __global cfloat_t *a,
                      __global cfloat_t *b)
    {
        int gid0 = get_global_id(0);
        int gid1 = get_global_id(1);
        b[gid1 + ySize*gid0] = a[gid1 + ySize*gid0]+a[gid1 + ySize*gid0];
    }
    """).build()
prg.sum(queue, eData2D.shape, None, np.int32(Ny), gpu_data.data, eData2D_dest_buf)
cl.enqueue_copy(queue, eData2Dresult, eData2D_dest_buf)
This does not work. Is there a way to do this? Thanks in advance for your help.
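One possible direction, offered as a sketch rather than a confirmed fix: the second snippet creates a new context and queue after gpu_data has already been uploaded, and an OpenCL buffer can only be used with kernels from the context it was allocated in. It also still refers to eData2D, which no longer exists once the get() call is removed. The example below reuses the queue and context attached to gpu_data instead. The function names double_reference and double_on_device are hypothetical, and the kernel uses float2 in place of cfloat_t on the assumption that the data is a C-contiguous 2D complex64 array (real part followed by imaginary part).

```python
import numpy as np

# Kernel source. float2 stands in for cfloat_t here (assumption: the data is
# numpy.complex64, stored as real part followed by imaginary part).
KERNEL_SRC = """
__kernel void sum(const unsigned int ySize,
                  __global const float2 *a,
                  __global float2 *b)
{
    int gid0 = get_global_id(0);
    int gid1 = get_global_id(1);
    int i = gid1 + ySize * gid0;   /* row-major flat index */
    b[i] = a[i] + a[i];
}
"""


def double_reference(data):
    """CPU reference for the kernel: each element added to itself."""
    return data + data


def double_on_device(gpu_data):
    """Run the kernel on an existing pyopencl.array.Array, with no host copy.

    Assumes gpu_data is a 2D, C-contiguous complex64 array.
    """
    # Imported lazily so the CPU reference above works without a device.
    import pyopencl as cl
    import pyopencl.array as cl_array

    queue = gpu_data.queue      # reuse the queue the array was created with
    ctx = queue.context         # and its context, instead of creating new ones
    prg = cl.Program(ctx, KERNEL_SRC).build()

    result = cl_array.empty_like(gpu_data)   # output buffer stays on the device
    ny = np.uint32(gpu_data.shape[1])
    prg.sum(queue, gpu_data.shape, None, ny, gpu_data.data, result.data)
    return result
```

With the names from the post, calling plan.execute(gpu_data.data) and then double_on_device(gpu_data).get() would bring only the final result back to the host, provided plan was built on that same context. Comparing the output against double_reference applied to the host-side FFT result is one way to check it.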