8

I am trying to translate CUDA code into OpenCL and right now I am stuck with these functions/variables:

  • __syncthreads()
  • blockIdx.x
4

3 Answers

11

Actually, I found it on my own! Here is a useful article: http://www.netlib.org/utk/people/JackDongarra/PAPERS/parcocudaopencl.pdf

The answer is: for __syncthreads(), use barrier(CLK_LOCAL_MEM_FENCE); for blockIdx.x, use get_group_id(0).
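To make the mapping concrete, here is a minimal sketch of an OpenCL C kernel that uses both replacements. The kernel name `group_sum` and its parameters are hypothetical, chosen only for illustration, and the sketch assumes the work-group size is a power of two and that the host allocates one float of local memory per work-item.

```c
// OpenCL C kernel (illustrative sketch): each work-group sums its slice of `in`
// and writes one partial result. CUDA equivalents are noted in the comments.
__kernel void group_sum(__global const float *in,
                        __global float *partial,  // one element per work-group
                        __local float *scratch,   // CUDA: extern __shared__ float scratch[];
                        const unsigned int n)
{
    const size_t gid = get_global_id(0);  // CUDA: blockIdx.x * blockDim.x + threadIdx.x
    const size_t lid = get_local_id(0);   // CUDA: threadIdx.x

    // Stage one element per work-item in local (CUDA: shared) memory.
    scratch[lid] = (gid < n) ? in[gid] : 0.0f;
    barrier(CLK_LOCAL_MEM_FENCE);         // CUDA: __syncthreads()

    // Tree reduction; assumes the work-group size is a power of two.
    for (size_t stride = get_local_size(0) / 2; stride > 0; stride /= 2) {
        if (lid < stride) {
            scratch[lid] += scratch[lid + stride];
        }
        // Every work-item in the group reaches this barrier on every iteration.
        barrier(CLK_LOCAL_MEM_FENCE);
    }

    if (lid == 0) {
        partial[get_group_id(0)] = scratch[0];  // CUDA: partial[blockIdx.x]
    }
}
```

As with __syncthreads() in CUDA, every work-item in the group has to reach each barrier, which is why the barrier calls sit outside the `if` branches.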

answered 2013-03-05T13:06:14.530
6

__syncthreads() -> barrier(...), but make sure you understand the difference between barrier(CLK_LOCAL_MEM_FENCE) and barrier(CLK_GLOBAL_MEM_FENCE); see this question or this documentation for more info.

blockIdx.x -> get_group_id(0), which gives you the ID of the group/block in the first (x) dimension.

于 2013-03-05T15:57:21.867 回答
0

There are many pages on the web that can help you port CUDA to OpenCL (for example here). I just want to remark, as pointed out here for the barrier, that there are both barrier(CLK_LOCAL_MEM_FENCE) and barrier(CLK_GLOBAL_MEM_FENCE). The main difference is that the first one ensures correct ordering of memory operations on local memory (shared memory in CUDA), and the second does the same for global memory. Be sure to use the correct one for your case.
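As a rough illustration of a case where the global flag matters, the sketch below has each work-item write to global memory and then read a value written by a neighbour in the same work-group. The kernel name `scale_then_shift` and its parameters are hypothetical, and the sketch assumes both buffers hold at least as many elements as the global work size.

```c
// OpenCL C kernel (illustrative sketch): work-items exchange data through
// global memory, so the barrier needs the global fence flag.
__kernel void scale_then_shift(__global float *tmp,
                               __global float *out,
                               const float factor)
{
    const size_t gid = get_global_id(0);
    const size_t lid = get_local_id(0);
    const size_t lsz = get_local_size(0);

    // Each work-item updates its own element in global memory.
    tmp[gid] *= factor;

    // Make the global writes above visible to the other work-items in this
    // work-group. CLK_LOCAL_MEM_FENCE alone only orders local-memory accesses.
    barrier(CLK_GLOBAL_MEM_FENCE);

    // Read the element written by the next work-item in the same group,
    // wrapping around at the end of the group.
    const size_t neighbour = (gid - lid) + (lid + 1) % lsz;
    out[gid] = tmp[neighbour];
}
```

Two caveats apply. Either flag synchronizes only the work-items of a single work-group, never different work-groups. Also, CUDA's __syncthreads() covers both shared and global memory accesses within a block, so when it is unclear which memory a ported kernel relies on, barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE) is the conservative translation.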

answered 2016-09-08T15:39:04.817