cuda - What is the OpenCL analogue for CUDA's __syncthreads() and blockIdx.x?

Question

I am trying to translate CUDA code into OpenCL and right now I am stuck with these functions/variables:

__syncthreads()
blockIdx.x

score 11 · Accepted Answer

Actually, I found it on my own! Here is a useful article: http://www.netlib.org/utk/people/JackDongarra/PAPERS/parcocudaopencl.pdf

The answer is: for __syncthreads(), use barrier(CLK_LOCAL_MEM_FENCE); for blockIdx.x, use get_group_id(0).

score 6 · Accepted Answer

__syncthreads() -> barrier(_), but make sure you understand the difference between barrier(CLK_LOCAL_MEM_FENCE) and barrier(CLK_GLOBAL_MEM_FENCE). See the linked question or the linked documentation for more info.

blockIdx.x -> get_group_id(0), which gives you the index of the work-group (the block, in CUDA terms) along the first (x) dimension.
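To make the mapping concrete, here is a minimal sketch of an OpenCL C kernel that sums each work-group's slice of an array, with the CUDA counterpart noted next to each call. The kernel name block_sum and its parameters are made up for illustration. It assumes the work-group size is a power of two and that the global size is a multiple of the work-group size.

```c
// OpenCL C kernel; block_sum, in, out and scratch are hypothetical names.
// Assumes: work-group size is a power of two, global size is a multiple of it.
__kernel void block_sum(__global const float *in,
                        __global float *out,
                        __local float *scratch)   // CUDA: extern __shared__ float scratch[];
{
    const size_t lid = get_local_id(0);    // CUDA: threadIdx.x
    const size_t gid = get_group_id(0);    // CUDA: blockIdx.x
    const size_t lsz = get_local_size(0);  // CUDA: blockDim.x

    // Each work-item copies one element into local (CUDA: shared) memory.
    scratch[lid] = in[gid * lsz + lid];
    barrier(CLK_LOCAL_MEM_FENCE);          // CUDA: __syncthreads();

    // Tree reduction inside the work-group.
    for (size_t stride = lsz / 2; stride > 0; stride /= 2) {
        if (lid < stride)
            scratch[lid] += scratch[lid + stride];
        // Every work-item in the group must reach this barrier,
        // so it sits outside the if, exactly as __syncthreads() would.
        barrier(CLK_LOCAL_MEM_FENCE);
    }

    // One partial sum per work-group.
    if (lid == 0)
        out[gid] = scratch[0];
}
```

Like __syncthreads(), barrier() only synchronizes work-items within the same work-group, not across work-groups.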

score 0 · Accepted Answer

There are many pages on the web that can help you port CUDA to OpenCL (for example here). I just want to remark, as pointed out here for the barrier, that there are both barrier(CLK_LOCAL_MEM_FENCE) and barrier(CLK_GLOBAL_MEM_FENCE). The main difference is that the first ensures correct ordering of memory operations on local memory (shared memory in CUDA), while the second does so for global memory. Be sure to use the correct one for your case.
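As a rough illustration of when the global flag matters, the sketch below has each work-item write to global memory and then read a value written by a neighbour in the same work-group. The kernel name and buffers are hypothetical, and it assumes the global size is a multiple of the work-group size.

```c
// OpenCL C kernel; shift_within_group, data and out are hypothetical names.
__kernel void shift_within_group(__global float *data, __global float *out)
{
    const size_t lid  = get_local_id(0);
    const size_t lsz  = get_local_size(0);
    const size_t base = get_group_id(0) * lsz;

    data[base + lid] *= 2.0f;              // write to global memory

    // Make the global writes above visible to the other work-items
    // of this work-group before anyone reads them.
    barrier(CLK_GLOBAL_MEM_FENCE);

    // Read the element written by the next work-item in the same group.
    out[base + lid] = data[base + (lid + 1) % lsz];
}
```

The two flags can also be combined, as in barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE), when a kernel touches both local and global memory before the barrier. Since __syncthreads() covers both shared and global memory accesses within a block, the combined form is the closest match when porting code that relies on that.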
