I am learning CUDA and am still a beginner. I am trying a simple task, but my code crashes when I run it and I don't know why. Any help would be appreciated.

EDIT: cudaMemcpy crashes when copying pixelVal, a member of the Image struct with type int**. Is that the cause?

Original C++ code:

void Image::reflectImage(bool flag, Image& oldImage)
/*Reflects the Image based on users input*/
{
    int rows = oldImage.N;
    int cols = oldImage.M;
    Image tempImage(oldImage);

    for(int i = 0; i < rows; i++)
    {
        for(int j = 0; j < cols; j++)
        tempImage.pixelVal[rows - (i + 1)][j] = oldImage.pixelVal[i][j];
    }
    oldImage = tempImage;
}

My CUDA kernel and code:

#define NTPB 512
__global__ void fliph(int* a, int* b, int r, int c)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;

    if (i >= r || j >= c)
        return;
    a[(r - i * c) + j] = b[i * c + j];
}
void Image::reflectImage(bool flag, Image& oldImage)
/*Reflects the Image based on users input*/
{
    int rows = oldImage.N;
    int cols = oldImage.M;
    Image tempImage(oldImage);
    if(flag == true) //horizontal reflection
    {
     //Allocate device memory
     int* dpixels;
     int* oldPixels;
     int n = rows * cols;
     cudaMalloc((void**)&dpixels, n * sizeof(int));
     cudaMalloc((void**)&oldPixels, n * sizeof(int));
     cudaMemcpy(dpixels, tempImage.pixelVal, n * sizeof(int), cudaMemcpyHostToDevice);
     cudaMemcpy(oldPixels, oldImage.pixelVal, n * sizeof(int), cudaMemcpyHostToDevice);
     int nblks = (n + NTPB - 1) / NTPB;
     fliph<<<nblks, NTPB>>>(dpixels, oldPixels, rows, cols);
     cudaMemcpy(tempImage.pixelVal, dpixels, n * sizeof(int), cudaMemcpyDeviceToHost);
     cudaFree(dpixels);
     cudaFree(oldPixels);
    }
    oldImage = tempImage;
}

1 Answer

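You have to create a 2D grid in order to process the image with the 2D indices i and j. With the current 1D launch, blockIdx.y and threadIdx.y are always 0, so j is always 0 and the kernel processes only a single line of the image.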

To set up 2D indexing, create a 2D block and a 2D grid as follows:

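const int BLOCK_DIM = 16;

// 16 x 16 = 256 threads per block
dim3 Block(BLOCK_DIM, BLOCK_DIM);

// Round up so the grid covers every pixel even when the image size is not a
// multiple of BLOCK_DIM. Here x spans the columns and y spans the rows, so
// the kernel should take its column index from x and its row index from y
// (the kernel in the question compares the x-derived index i against r).
dim3 Grid;
Grid.x = (cols + Block.x - 1) / Block.x;
Grid.y = (rows + Block.y - 1) / Block.y;

fliph<<<Grid, Block>>>(dpixels, oldPixels, rows, cols);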
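
For illustration, here is a minimal sketch of how the 2D launch above could fit together with a kernel that reads its column from x and its row from y. The names flipRows and flipOnDevice are hypothetical and not from the question. The mirrored index (rows - 1 - row) * cols + col is my reading of rows - (i + 1) in the original C++ loop, and it differs from the expression in the question's kernel. Since the question's edit mentions that pixelVal is an int**, the sketch also copies the rows into one contiguous buffer first: cudaMemcpy copies a single flat block of bytes, and an int** points to an array of row pointers rather than to the pixels themselves. It assumes rows and cols are both positive.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: x indexes columns and y indexes rows, matching the
// Grid.x / Grid.y sizing above. Row `row` of src is written to row
// (rows - 1 - row) of dst, i.e. the image is mirrored top to bottom.
__global__ void flipRows(int* dst, const int* src, int rows, int cols)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row >= rows || col >= cols)
        return;
    dst[(rows - 1 - row) * cols + col] = src[row * cols + col];
}

// Hypothetical helper: flattens an int** image into one contiguous buffer,
// flips it on the device, and returns the flipped pixels in row-major order.
std::vector<int> flipOnDevice(int** pixelVal, int rows, int cols)
{
    assert(rows > 0 && cols > 0);  // assumption: non-empty image
    const size_t n = static_cast<size_t>(rows) * cols;

    // Copy row by row, because the rows of an int** need not be adjacent.
    std::vector<int> host(n);
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            host[static_cast<size_t>(i) * cols + j] = pixelVal[i][j];

    int* dSrc = nullptr;
    int* dDst = nullptr;
    cudaMalloc((void**)&dSrc, n * sizeof(int));
    cudaMalloc((void**)&dDst, n * sizeof(int));
    cudaMemcpy(dSrc, host.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    const int BLOCK_DIM = 16;
    dim3 block(BLOCK_DIM, BLOCK_DIM);
    dim3 grid((cols + block.x - 1) / block.x,
              (rows + block.y - 1) / block.y);
    flipRows<<<grid, block>>>(dDst, dSrc, rows, cols);

    // Report launch errors, then errors from the copy back to the host.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaMemcpy(host.data(), dDst, n * sizeof(int),
                         cudaMemcpyDeviceToHost);
    if (err != cudaSuccess)
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));

    cudaFree(dSrc);
    cudaFree(dDst);
    return host;
}
```

The returned vector is row-major, so writing it back into tempImage.pixelVal would need the same row-by-row loop in the opposite direction.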
Answered 2013-04-04T18:03:42.183