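I am trying to write texture data with OpenCL and display it with OpenGL. The goal is to compare performance with and without interop. Right now, however, I cannot even generate the texture itself. I first wanted to get this working without interop: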

CL buffer initialization and kernel creation:

this->imageBuffer = new cl::Image2D(*context, CL_MEM_WRITE_ONLY, cl::ImageFormat(CL_RGBA, CL_FLOAT), this->width, this->height); //this->imageBuffer is a cl::Image*

//create kernel
this->kernel = new cl::Kernel(program, "simple_kernel");

//set kernel arguments
this->kernel->setArg(0, *(this->imageBuffer));
this->kernel->setArg(1, this->offset); 

Kernel execution (happens in a loop):

cl::size_t<3> origin;
origin[0] = 0; origin[1] = 0; origin[2] = 0;
cl::size_t<3>  range;
range[0] = this->width; range[1] = this->height; range[2] = 1;

//Not strictly necessary, but it illustrates my point
this->queue->enqueueWriteImage(*(this->imageBuffer), CL_TRUE, origin, range, 0, 0, this->imageOutput);

//enqueue kernel with NDRange
this->queue->enqueueNDRangeKernel(*(this->kernel), cl::NullRange, *(this->global_size), cl::NullRange);

this->queue->enqueueReadImage(*(this->imageBuffer), CL_TRUE, origin, range, 0, 0, this->imageOutput);

this->queue->finish();

std::cout << "fancy output: " << std::endl;
for(int y = 0; y < this->height; y++) {
    for(int x = 0; x < this->width; x++) {
        std::cout << this->imageOutput[(y * this->width) + x] << ";";
    }
    std::cout << std::endl;
}

OpenCL kernel:

__kernel void simple_kernel(__global __write_only image2d_t texture, float offset) { //offset is not used for testing
    int x = get_global_id(0);
    int y = get_global_id(1);

    int2 pixelPos = (int2)(x, y);
    float4 pixelColor = (float4)(0.5f, 0.0f, 0.0f, 1.0f);

    write_imagef(texture, pixelPos, pixelColor);
};

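All the solutions I found for similar problems were related to the internal format and format passed to glTexImage2D (which is called at the start of the OpenGL rendering part), so this may well be the same problem here, but I don't see what I am doing wrong.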

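The expected result is a red quad. Instead, it only shows the initialized texture (white in this case; black if initialized with 0.0f; grey if initialized with 0.4f). By also enqueueing the writeImage, I was able to narrow it down: the kernel does not seem to change the buffer at all. If the writeImage is commented out, a black quad is shown. So reading the buffer seems to work, because in that case it reads an empty buffer (resulting in the black quad).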

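This is reinforced by the fact that the fancy output (at the end of the kernel execution part) only prints the initialized values (e.g. 1s or 0.4s, or 0s when writeImage is not used).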

1 Answer

First, your "fancy output" part is wrong. Your image has width * height * 4 float elements (four channels per pixel), but you index it as if it had only width * height elements.

auto pitch = this->width * 4;
std::cout << "fancy output: " << std::endl;
for(int y = 0; y < this->height; y++) {
    for(int x = 0; x < this->width; x++) {
        auto r = this->imageOutput[(y * pitch) + x * 4 + 0];
        auto g = this->imageOutput[(y * pitch) + x * 4 + 1];
        auto b = this->imageOutput[(y * pitch) + x * 4 + 2];
        auto a = this->imageOutput[(y * pitch) + x * 4 + 3];
        std::cout << r << ' ' << g << ' '<< b << ' ' << a << ';';
    }
    std::cout << '\n';
}
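To make the indexing above easier to check on the host without a device, here is a minimal sketch of the same arithmetic. The helper names rgbaIndex and makeExpectedImage are hypothetical and not part of the question's code; the sketch assumes a tightly packed RGBA float buffer with no row padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of channel c (0 = R, 1 = G, 2 = B, 3 = A) of
// pixel (x, y) in a tightly packed RGBA float buffer `width` pixels wide.
// Equivalent to (y * pitch) + x * 4 + c with pitch = width * 4.
std::size_t rgbaIndex(std::size_t x, std::size_t y, std::size_t width, std::size_t c) {
    assert(x < width && c < 4);
    return (y * width + x) * 4 + c;
}

// Hypothetical helper: builds the host-side image the kernel from the
// question is expected to produce, i.e. every pixel set to (0.5, 0, 0, 1).
std::vector<float> makeExpectedImage(std::size_t width, std::size_t height) {
    std::vector<float> image(width * height * 4, 0.0f);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            image[rgbaIndex(x, y, width, 0)] = 0.5f; // red
            image[rgbaIndex(x, y, width, 3)] = 1.0f; // alpha
        }
    }
    return image;
}
```

Comparing the buffer read back from the device against makeExpectedImage(width, height) is one way to see whether the kernel actually wrote anything.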

Second, your kernel fails to compile on my platform because you mark the image as both __global and __write_only image2d_t. You should omit the __global.
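As a sketch of what the kernel could look like with that qualifier removed, here it is embedded as a C++ raw string, the way it might be handed to cl::Program. The constant name kSimpleKernelSource is hypothetical; the kernel body is otherwise the one from the question.

```cpp
#include <string>

// Hypothetical constant name. The image parameter carries only the access
// qualifier (__write_only); image types take no address space qualifier.
const std::string kSimpleKernelSource = R"CLC(
__kernel void simple_kernel(__write_only image2d_t texture, float offset) {
    int2 pixelPos = (int2)(get_global_id(0), get_global_id(1));
    float4 pixelColor = (float4)(0.5f, 0.0f, 0.0f, 1.0f);
    write_imagef(texture, pixelPos, pixelColor);
}
)CLC";
```

If the program fails to build, the build log of the program object is the place to look for the exact compiler message.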

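Third is your clEnqueueReadImage call. You pass a row pitch of 0, while the pitch for this image is width * 4 * sizeof(float). Note that the OpenCL specification lets a row pitch of 0 mean "compute it as width times the element size", which yields the same value, so passing it explicitly mainly makes the layout unambiguous.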
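The pitch is measured in bytes, not in floats, which is easy to mix up with the element pitch used when indexing the host array. A small sketch of the calculation, assuming a CL_RGBA / CL_FLOAT image with no row padding (rowPitchBytes is a hypothetical helper name):

```cpp
#include <cstddef>

// Hypothetical helper: bytes per row of a tightly packed image with four
// float channels per pixel (CL_RGBA, CL_FLOAT).
constexpr std::size_t rowPitchBytes(std::size_t width) {
    return width * 4 * sizeof(float);
}

// A 2-pixel-wide row holds 8 floats.
static_assert(rowPitchBytes(2) == 8 * sizeof(float), "unexpected row pitch");
```

The result could then be passed as the row pitch argument of enqueueReadImage in place of the literal 0.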

Answered 2020-12-05T16:31:42.837