6

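I am trying to solve a problem in which I have to convert a colour image to a greyscale image. For this purpose I am using CUDA's parallel approach.

The kernel code I am invoking on the GPU is as follows.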

__global__
void rgba_to_greyscale(const uchar4* const rgbaImage,
                   unsigned char* const greyImage,
                   int numRows, int numCols)
{
    int absolute_image_position_x = blockIdx.x;
    int absolute_image_position_y = blockIdx.y;

    if ( absolute_image_position_x >= numCols ||
         absolute_image_position_y >= numRows )
    {
        return;
    }
    uchar4 rgba = rgbaImage[absolute_image_position_x + absolute_image_position_y];
    float channelSum = .299f * rgba.x + .587f * rgba.y + .114f * rgba.z;
    greyImage[absolute_image_position_x + absolute_image_position_y] = channelSum;
}

void your_rgba_to_greyscale(const uchar4 * const h_rgbaImage,
                            uchar4 * const d_rgbaImage,
                            unsigned char* const d_greyImage,
                            size_t numRows,
                            size_t numCols)
{
  //You must fill in the correct sizes for the blockSize and gridSize
  //currently only one block with one thread is being launched
  const dim3 blockSize(numCols/32, numCols/32 , 1);  //TODO
  const dim3 gridSize(numRows/12, numRows/12 , 1);  //TODO
  rgba_to_greyscale<<<gridSize, blockSize>>>(d_rgbaImage,
                                             d_greyImage,
                                             numRows,
                                             numCols);

  cudaDeviceSynchronize(); checkCudaErrors(cudaGetLastError());
}


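I see a line of dots in the first pixel row.

The error I am getting is

libdc1394 error: Failed to initialize libdc1394
Difference at pos 51 exceeds tolerance of 5
Reference: 255
GPU: 0
My input/output images. Can anyone help me with this? Thanks in advance.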

4

12 Answers

6

I recently took this course and tried your solution, but it did not work, so I tried my own. You were almost correct. The correct solution is this:

__global__
void rgba_to_greyscale(const uchar4* const rgbaImage,
                       unsigned char* const greyImage,
                       int numRows, int numCols)
{
    int pos_x = (blockIdx.x * blockDim.x) + threadIdx.x;
    int pos_y = (blockIdx.y * blockDim.y) + threadIdx.y;
    if (pos_x >= numCols || pos_y >= numRows)
        return;

    uchar4 rgba = rgbaImage[pos_x + pos_y * numCols];
    greyImage[pos_x + pos_y * numCols] = (.299f * rgba.x + .587f * rgba.y + .114f * rgba.z);
}

The rest is the same as your code.
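To see why the `pos_y * numCols` term matters, here is a small host-side C++ sketch. The helper names `flat_index` and `broken_index` are illustrative only and are not part of the course template. In a row-major image every (x, y) pair has to map to a distinct offset; adding x and y without the row stride makes many pixels collide.

```cpp
#include <cstddef>

// Row-major offset of pixel (x, y) in an image that is numCols wide.
// Hypothetical helper for illustration.
inline std::size_t flat_index(std::size_t x, std::size_t y, std::size_t numCols) {
    return y * numCols + x;
}

// The indexing used in the question: the row stride is missing,
// so (1, 0) and (0, 1) both land on offset 1.
inline std::size_t broken_index(std::size_t x, std::size_t y) {
    return x + y;
}
```

With the broken form, every pixel on the same anti-diagonal shares one offset, which fits the symptom of output appearing only near the start of the first row.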

Answered 2015-10-17T14:46:32.383
5

Since posting this question I have kept working on the problem, and I now realise that my initial solution was wrong and needed some improvements.
Changes to be made:

 1. absolute_position_x =(blockIdx.x * blockDim.x) + threadIdx.x;
 2. absolute_position_y = (blockIdx.y * blockDim.y) + threadIdx.y;

Second,

 1. const dim3 blockSize(24, 24, 1);
 2. const dim3 gridSize((numCols/16), (numRows/16) , 1);

In this solution we use a grid of numCols/16 * numRows/16
and a block size of 24 * 24.

The code executed in 0.040576 ms.

@datenwolf: thanks for the answer above!!!

Answered 2013-02-06T04:03:33.283
2

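Since you do not know the image size, it is best to choose any reasonable dimension for the two-dimensional thread block and then check two conditions. First, the pos_x and pos_y indices in the kernel must not exceed numCols and numRows respectively. Second, the grid should be sized so that the total number of threads in all blocks is at least the number of pixels, and only slightly above it.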

const dim3 blockSize(16, 16, 1);
const dim3 gridSize((numCols%16) ? numCols/16+1 : numCols/16,
(numRows%16) ? numRows/16+1 : numRows/16, 1);
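The conditional expression above is a ceiling division. A minimal host-side sketch of the same computation follows; `ceil_div` is a hypothetical helper name, not something from the course code.

```cpp
#include <cstddef>

// Smallest number of blocks of width `block` that covers `n` elements.
// Same result as (n % block) ? n / block + 1 : n / block, for block > 0.
inline std::size_t ceil_div(std::size_t n, std::size_t block) {
    return (n + block - 1) / block;
}
```

Under that assumption the launch configuration could be written as `const dim3 gridSize(ceil_div(numCols, 16), ceil_div(numRows, 16), 1);`. Unlike `n/16 + 1`, this does not add a spare block when the size is already a multiple of 16.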
Answered 2016-09-06T19:26:55.550
1
__global__
void rgba_to_greyscale(const uchar4* const rgbaImage,
                       unsigned char* const greyImage,
                       int numRows, int numCols)
{
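    int rgba_x = blockIdx.x * blockDim.x + threadIdx.x;
    int rgba_y = blockIdx.y * blockDim.y + threadIdx.y;

    // The grid is rounded up (numCols/24+1, numRows/24+1), so some threads
    // fall outside the image and must not touch memory.
    if (rgba_x >= numCols || rgba_y >= numRows)
        return;

    int pixel_pos = rgba_x + rgba_y * numCols;

    uchar4 rgba = rgbaImage[pixel_pos];
    unsigned char gray = (unsigned char)(0.299f * rgba.x + 0.587f * rgba.y + 0.114f * rgba.z);
    greyImage[pixel_pos] = gray;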
}

void your_rgba_to_greyscale(const uchar4 * const h_rgbaImage, uchar4 * const d_rgbaImage,
                            unsigned char* const d_greyImage, size_t numRows, size_t numCols)
{
    //You must fill in the correct sizes for the blockSize and gridSize
    //currently only one block with one thread is being launched
    const dim3 blockSize(24, 24, 1);  //TODO
    const dim3 gridSize( numCols/24+1, numRows/24+1, 1);  //TODO
    rgba_to_greyscale<<<gridSize, blockSize>>>(d_rgbaImage, d_greyImage, numRows, numCols);

    cudaDeviceSynchronize(); checkCudaErrors(cudaGetLastError());
}
Answered 2014-03-10T07:15:14.180
1

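libdc1394 error: Failed to initialize libdc1394

I don't think this is a CUDA problem. libdc1394 is a library for accessing IEEE1394 aka FireWire aka iLink video devices (DV camcorders, Apple iSight cameras). That library does not initialise properly, hence you are not getting useful results. Basically it is NINO: Nonsense In, Nonsense Out.

Answered 2013-02-05T16:07:41.727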
1

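The calculation of the absolute x and y image positions is perfect. But when you need to access a particular pixel in the colour image, shouldn't you use the following code?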

uchar4 rgba = rgbaImage[absolute_image_position_x + (absolute_image_position_y * numCols)];

I think so; compare it with the code you would write for the same problem in serial code. Please let me know :)
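For that comparison, a serial version might look like the sketch below. It assumes a row-major image and uses a small `Pixel` struct as a stand-in for CUDA's `uchar4` so that it compiles without the CUDA headers; `grey_serial` is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for CUDA's uchar4 (assumption: x = R, y = G, z = B, w = A).
struct Pixel { unsigned char x, y, z, w; };

// Serial greyscale conversion over a row-major image.
std::vector<unsigned char> grey_serial(const std::vector<Pixel>& rgba,
                                       std::size_t numRows, std::size_t numCols) {
    std::vector<unsigned char> grey(numRows * numCols);
    for (std::size_t r = 0; r < numRows; ++r) {
        for (std::size_t c = 0; c < numCols; ++c) {
            // Same offset the kernel needs: row * numCols + column.
            const Pixel& p = rgba[r * numCols + c];
            float channelSum = .299f * p.x + .587f * p.y + .114f * p.z;
            grey[r * numCols + c] = static_cast<unsigned char>(channelSum);
        }
    }
    return grey;
}
```

The loop variables `r` and `c` play the roles of `absolute_image_position_y` and `absolute_image_position_x`, which makes the missing `* numCols` in the question easy to spot.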

Answered 2013-05-30T04:58:09.967
1

You should still have a problem at run time: the conversion will not give the correct result.

These lines:

  1. uchar4 rgba = rgbaImage[absolute_image_position_x + absolute_image_position_y];
  2. greyImage[absolute_image_position_x + absolute_image_position_y] = channelSum;

should be changed to:

  1. uchar4 rgba = rgbaImage[absolute_image_position_x + absolute_image_position_y*numCols];
  2. greyImage[absolute_image_position_x + absolute_image_position_y*numCols] = channelSum;
Answered 2013-10-14T06:50:04.707
1

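In this case the libdc1394 error has nothing to do with FireWire etc. It is the library Udacity uses to compare the image your program creates with the reference image. What the message says is that the difference between your image and the reference image exceeded a certain threshold at that position, i.e. at that pixel.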
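The "Difference at pos 51 exceeds tolerance of 5" line reads like the output of an element-wise comparison against a reference image. A minimal sketch of such a check is shown below, assuming both buffers have the same length. `first_mismatch` is a hypothetical name, and this is not the grader's actual code.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Returns the first position where |ref - gpu| exceeds `tolerance`,
// or ref.size() if every position is within tolerance.
// Assumes ref and gpu have the same length.
std::size_t first_mismatch(const std::vector<unsigned char>& ref,
                           const std::vector<unsigned char>& gpu,
                           int tolerance) {
    for (std::size_t i = 0; i < ref.size(); ++i) {
        int diff = std::abs(static_cast<int>(ref[i]) - static_cast<int>(gpu[i]));
        if (diff > tolerance) return i;
    }
    return ref.size();
}
```

A reference value of 255 against a GPU value of 0, as in the question, is a difference of 255, so a pixel the kernel never wrote would fail a check like this.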

Answered 2015-07-12T05:17:10.173
0

You are launching the following block and grid sizes:

  const dim3 blockSize(numCols/32, numCols/32 , 1);  //TODO
  const dim3 gridSize(numRows/12, numRows/12 , 1);  //TODO

But you are not using any threads in your kernel code!

 int absolute_image_position_x = blockIdx.x;  
 int absolute_image_position_y = blockIdx.y;

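Think of it this way: the width of the image can be divided into absolute_image_position_x parts of columns, and the height into absolute_image_position_y parts of rows. Each box created by that cross-section then needs to change/redraw all of its pixels in parallel, in terms of greyImage. That is enough of a spoiler for an assignment :)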
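The effect of ignoring `threadIdx` can be checked on the CPU. The sketch below emulates a 2D launch with plain loops and counts how many distinct (x, y) positions the threads compute. The function name and the `useThreadIdx` switch are illustrative assumptions, not part of CUDA or the course code.

```cpp
#include <cstddef>
#include <set>
#include <utility>

// Emulates a 2D kernel launch on the CPU and counts the distinct (x, y)
// positions computed. With useThreadIdx == false the position depends only
// on the block index, as in the question's kernel.
std::size_t distinct_positions(unsigned gridX, unsigned gridY,
                               unsigned blockX, unsigned blockY,
                               bool useThreadIdx) {
    std::set<std::pair<unsigned, unsigned>> seen;
    for (unsigned by = 0; by < gridY; ++by)
        for (unsigned bx = 0; bx < gridX; ++bx)
            for (unsigned ty = 0; ty < blockY; ++ty)
                for (unsigned tx = 0; tx < blockX; ++tx) {
                    unsigned x = useThreadIdx ? bx * blockX + tx : bx;
                    unsigned y = useThreadIdx ? by * blockY + ty : by;
                    seen.insert({x, y});
                }
    return seen.size();
}
```

For a 2 x 2 grid of 4 x 4 blocks, the block-only version yields 4 positions while the full formula yields 64: every thread in a block redoes the same pixel unless `threadIdx` enters the index.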

Answered 2013-02-06T00:20:54.260
0

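The same code, with handling for non-standard input image sizes:

__global__
void rgba_to_greyscale(const uchar4* const rgbaImage,
                       unsigned char* const greyImage,
                       int numRows, int numCols)
{
    // Here x walks over rows and y over columns, matching the gridSize below.
    int idx = blockDim.x * blockIdx.x + threadIdx.x;
    int idy = blockDim.y * blockIdx.y + threadIdx.y;

    // The grid is rounded up, so skip threads that fall outside the image.
    if (idx >= numRows || idy >= numCols)
        return;

    uchar4 rgbcell = rgbaImage[idx * numCols + idy];
    greyImage[idx * numCols + idy] = 0.299f * rgbcell.x + 0.587f * rgbcell.y + 0.114f * rgbcell.z;
}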

void your_rgba_to_greyscale(const uchar4 * const h_rgbaImage, uchar4 * const d_rgbaImage,
                            unsigned char* const d_greyImage, size_t numRows, size_t numCols)
{
    //You must fill in the correct sizes for the blockSize and gridSize
    //currently only one block with one thread is being launched

    // Pick the largest candidate block edge that divides the pixel count
    // (std::vector needs #include <vector>).
    int totalpixels = numRows * numCols;
    int factors[] = {2, 4, 8, 16, 24, 32};
    std::vector<int> numbers(factors, factors + sizeof(factors) / sizeof(int));
    int factor = 1;

    while (!numbers.empty())
    {
        if (totalpixels % numbers.back() == 0)
        {
            factor = numbers.back();
            break;
        }
        else
        {
            numbers.pop_back();
        }
    }

    const dim3 blockSize(factor, factor, 1);  //TODO
    const dim3 gridSize(numRows / factor + 1, numCols / factor + 1, 1);  //TODO
    rgba_to_greyscale<<<gridSize, blockSize>>>(d_rgbaImage, d_greyImage, numRows, numCols);

    cudaDeviceSynchronize(); checkCudaErrors(cudaGetLastError());
}
Answered 2015-06-25T22:39:59.737
0

 1. int x = (blockIdx.x * blockDim.x) + threadIdx.x;
 2. int y = (blockIdx.y * blockDim.y) + threadIdx.y;

And for the grid and block sizes:

 1. const dim3 blockSize(32, 32, 1);
 2. const dim3 gridSize((numCols/32+1), (numRows/32+1) , 1);

The code executed in 0.036992 ms.

Answered 2017-02-09T15:34:04.323
0
// Host side: x covers columns, y covers rows
const dim3 blockSize(16, 16, 1);  //TODO
const dim3 gridSize( (numCols+15)/16, (numRows+15)/16, 1);  //TODO

// Inside the kernel
int x = blockIdx.x * blockDim.x + threadIdx.x;
int y = blockIdx.y * blockDim.y + threadIdx.y;
if (x >= numCols || y >= numRows) return;  // grid is rounded up, so guard the edges

uchar4 rgba = rgbaImage[y*numCols + x];
float channelSum = .299f * rgba.x + .587f * rgba.y + .114f * rgba.z;
greyImage[y*numCols + x] = channelSum;
Answered 2017-07-19T03:30:43.777