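I am working on a project where I have to send an array of structs to a CUDA kernel. The struct also contains an array. To test this, I wrote a simple program.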

struct Point {
    short     x;
    short     *y;
};

My kernel code:

__global__ void addKernel(Point *a, Point *b, Point *c)
{
    int i = threadIdx.x;

    c[i].x = a[i].x + b[i].x;
    for (int j = 0; j<4; j++){
        c[i].y[j] = a[i].y[j] + a[i].y[j];
    }
}

My main code:

int main()
{
    const int arraySize = 4;
    const int arraySize2 = 4;

    short *ya, *yb, *yc;
    short *dev_ya, *dev_yb, *dev_yc;

    Point *a;
    Point *b;
    Point *c;
    Point *dev_a;
    Point *dev_b;
    Point *dev_c;

    size_t sizeInside = sizeof(short) * arraySize2;

    ya = (short *)malloc(sizeof(short) * arraySize2);
    yb = (short *)malloc(sizeof(short) * arraySize2);
    yc = (short *)malloc(sizeof(short) * arraySize2);

    ya[0] = 1; ya[1] =2; ya[2]=3; ya[3]=4;
    yb[0] = 2; yb[1] =3; yb[2]=4; yb[3]=5;

    size_t sizeGeneral = (sizeInside+sizeof(short)) * arraySize;

    a = (Point *)malloc( sizeGeneral );  
    b = (Point *)malloc( sizeGeneral );
    c = (Point *)malloc( sizeGeneral );


    a[0].x = 2;  a[0].y = ya;
    a[1].x = 2;  a[1].y = ya;
    a[2].x = 2;  a[2].y = ya;
    a[3].x = 2;  a[3].y = ya;

    b[0].x = 4;  b[0].y = yb;
    b[1].x = 4;  b[1].y = yb;
    b[2].x = 4;  b[2].y = yb;
    b[3].x = 4;  b[3].y = yb;

    cudaMalloc((void**)&dev_a, sizeGeneral);
    cudaMalloc((void**)&dev_b, sizeGeneral);
    cudaMalloc((void**)&dev_c, sizeGeneral);

    cudaMemcpy(dev_a, a, sizeGeneral, cudaMemcpyHostToDevice);
    cudaMemcpy(dev_b, b, sizeGeneral, cudaMemcpyHostToDevice);

    addKernel<<<1, 4>>>(dev_a, dev_b, dev_c);

    cudaError_t err = cudaMemcpy(c, dev_c, sizeGeneral, cudaMemcpyDeviceToHost);   

    printf("{%d-->%d,%d,%d,%d} \n err= %d",c[0].x,c[0].y[0],c[1].y[1],c[1].y[2],c[2].y[3], err);        

    cudaFree(dev_a);
    cudaFree(dev_b);
    cudaFree(dev_c);

    return 0;
}

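The CUDA kernel does not seem to work properly. I can access the struct's 'x' member, but I cannot access the 'y' array. What should I do to access the 'y' array? Thanks in advance.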

2 Answers

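When you copy this struct to the device, you copy a short and a pointer, and that pointer still holds an address in host memory, not device memory. This is the crucial point. Simple types work because their values are copied directly. So after your cudaMemcpy calls, x and the pointer y are on the device, but the array that y points to is not. You have to handle that manually: allocate device memory for the array, copy its contents there, and update y to point to that device memory.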
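
The steps described here could be sketched roughly as follows. This is a minimal illustration under a few assumptions, not the only way to do it: the helper names makeDevicePoints and addPoints are hypothetical, each Point gets its own slice of one flat device buffer (the question instead shared one array between all Points), element sizes use sizeof(Point) because the struct stores a pointer rather than the array, the kernel adds a and b, and most error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>

constexpr int N = 4;  // number of Points (arraySize in the question)
constexpr int M = 4;  // shorts per y array (arraySize2 in the question)

struct Point {
    short  x;
    short *y;  // must hold a device address when a kernel dereferences it
};

__global__ void addKernel(const Point *a, const Point *b, Point *c)
{
    int i = threadIdx.x;
    c[i].x = a[i].x + b[i].x;
    for (int j = 0; j < M; j++) {
        c[i].y[j] = a[i].y[j] + b[i].y[j];
    }
}

// Hypothetical helper: allocates one device buffer of N * M shorts, fills it
// from hostY (if not NULL), and builds N device Points whose y members point
// into that buffer. The buffer is returned through devYOut so it can be
// copied back and freed later.
static Point *makeDevicePoints(const short *hostX, const short *hostY, short **devYOut)
{
    short *devY = NULL;
    cudaMalloc((void **)&devY, sizeof(short) * N * M);
    if (hostY != NULL) {
        cudaMemcpy(devY, hostY, sizeof(short) * N * M, cudaMemcpyHostToDevice);
    }

    // Fill the structs on the host, but store DEVICE addresses in y.
    Point staging[N];
    for (int i = 0; i < N; i++) {
        staging[i].x = (hostX != NULL) ? hostX[i] : 0;
        staging[i].y = devY + i * M;
    }

    Point *devPoints = NULL;
    cudaMalloc((void **)&devPoints, sizeof(Point) * N);
    cudaMemcpy(devPoints, staging, sizeof(Point) * N, cudaMemcpyHostToDevice);

    *devYOut = devY;
    return devPoints;
}

// Hypothetical driver: ax/bx hold N values, ay/by hold N * M values laid out
// point by point. Writes N sums to outX and N * M sums to outY.
cudaError_t addPoints(const short *ax, const short *ay,
                      const short *bx, const short *by,
                      short *outX, short *outY)
{
    short *dYa = NULL, *dYb = NULL, *dYc = NULL;
    Point *dA = makeDevicePoints(ax, ay, &dYa);
    Point *dB = makeDevicePoints(bx, by, &dYb);
    Point *dC = makeDevicePoints(NULL, NULL, &dYc);

    addKernel<<<1, N>>>(dA, dB, dC);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }

    // The y members of dC are device addresses, so x is read through the
    // copied structs while the y data is copied from the device buffer.
    Point hostC[N];
    cudaMemcpy(hostC, dC, sizeof(Point) * N, cudaMemcpyDeviceToHost);
    for (int i = 0; i < N; i++) {
        outX[i] = hostC[i].x;
    }
    cudaMemcpy(outY, dYc, sizeof(short) * N * M, cudaMemcpyDeviceToHost);

    cudaFree(dA);  cudaFree(dB);  cudaFree(dC);
    cudaFree(dYa); cudaFree(dYb); cudaFree(dYc);
    return err;
}
```

With the question's inputs (x = 2 and y = {1, 2, 3, 4} for every Point in a, x = 4 and y = {2, 3, 4, 5} for every Point in b), addPoints should fill outX with 6 and each group of four entries in outY with 3, 5, 7, 9. Note that the y pointers copied back in hostC must not be dereferenced on the host, since they are device addresses.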

answered 2013-04-12T07:41:22.297

You are not passing the array to the device. You can make the array part of the struct by defining it like this:

struct Point {
  short normalVal;
  short inStructArr[4];  // array stored inline, inside the struct itself
};

Or copy the array to device memory and update the pointer in the struct.
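
The first option, embedding the array in the struct, might look like the sketch below. The names PointInline, addInlineKernel and addInlinePoints are hypothetical (PointInline is used to keep it distinct from the question's pointer-based Point), the array length is assumed to be fixed at compile time, and most error checking is omitted. Because the array lives inside the struct, a single cudaMemcpy of sizeof(PointInline) * N bytes carries the data along and no pointer needs patching.

```cuda
#include <cuda_runtime.h>

constexpr int N = 4;  // number of structs
constexpr int M = 4;  // fixed length of the embedded array

struct PointInline {
    short normalVal;
    short inStructArr[M];  // stored inline, so a plain copy includes it
};

__global__ void addInlineKernel(const PointInline *a, const PointInline *b, PointInline *c)
{
    int i = threadIdx.x;
    c[i].normalVal = a[i].normalVal + b[i].normalVal;
    for (int j = 0; j < M; j++) {
        c[i].inStructArr[j] = a[i].inStructArr[j] + b[i].inStructArr[j];
    }
}

// Hypothetical driver: a, b and c are host arrays of N structs each.
cudaError_t addInlinePoints(const PointInline *a, const PointInline *b, PointInline *c)
{
    const size_t bytes = sizeof(PointInline) * N;
    PointInline *dA = NULL, *dB = NULL, *dC = NULL;

    cudaMalloc((void **)&dA, bytes);
    cudaMalloc((void **)&dB, bytes);
    cudaMalloc((void **)&dC, bytes);

    // One copy per array moves both the scalar and the embedded array.
    cudaMemcpy(dA, a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, b, bytes, cudaMemcpyHostToDevice);

    addInlineKernel<<<1, N>>>(dA, dB, dC);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }

    cudaMemcpy(c, dC, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return err;
}
```

The trade-off is that every struct carries a full array of M elements, so this fits best when the array length is small and known in advance; for variable-length data the pointer-based approach is the more flexible one.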

answered 2013-04-12T07:51:32.733