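My CUDA code produces correct results in debug mode. In release mode, however, the same code produces garbage. Can synchronization between threads behave differently in debug and release builds?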

2 Answers

Code compiled with -O0 is less optimized and performs significantly more global and local memory accesses, which may hide a race condition. If you suspect a race condition in shared memory, you can try the memory checker in the new CUDA 5.0 preview, which supports some forms of race condition detection. Your best bet is to look at every location where memory is shared between two threads and determine whether you are missing a thread fence or a __syncthreads() call.
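To illustrate the kind of spot worth checking, here is a minimal sketch, not the asker's code: a kernel in which each thread writes one shared-memory slot and then reads a slot written by a different thread. The names reverse_block, run_reverse and kThreads are made up for this example. Without the __syncthreads() between the write and the read, the result depends on scheduling and timing, which is the sort of bug that can stay hidden in a -O0 build. As far as I know, the race detection mentioned above is invoked as cuda-memcheck --tool racecheck, and newer toolkits offer compute-sanitizer --tool racecheck.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // hypothetical block size chosen for this sketch

// Reverses kThreads ints through shared memory (single block only).
__global__ void reverse_block(const int* in, int* out)
{
    __shared__ int tile[kThreads];
    int t = threadIdx.x;

    tile[t] = in[t];   // each thread writes its own slot

    // Barrier: every write to tile[] must be complete before any thread
    // reads a slot that another thread wrote. Removing this line creates
    // a shared-memory race.
    __syncthreads();

    out[t] = tile[kThreads - 1 - t];  // read a slot written by another thread
}

// Host helper (hypothetical): runs the kernel once and returns the number
// of mismatching elements, or -1 if a CUDA call failed.
int run_reverse()
{
    int h_in[kThreads], h_out[kThreads];
    for (int i = 0; i < kThreads; ++i) h_in[i] = i;

    int* d_in = nullptr;
    int* d_out = nullptr;
    if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) {
        cudaFree(d_in);
        return -1;
    }

    cudaError_t err = cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        reverse_block<<<1, kThreads>>>(d_in, d_out);
        err = cudaGetLastError();  // catches launch configuration errors
    }
    if (err == cudaSuccess) {
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    }

    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < kThreads; ++i) {
        if (h_out[i] != kThreads - 1 - i) ++mismatches;
    }
    return mismatches;
}

int main()
{
    printf("mismatches: %d\n", run_reverse());
    return 0;
}
```

Comparing a debug build (nvcc -G) with an optimized build of a kernel like this, once with and once without the barrier, is one way to check whether a missing synchronization explains the difference you are seeing.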

Answered 2012-05-25T01:37:14.837

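I think you have a race condition. You can reorganize your code and add synchronization where it is needed. In debug mode your threads typically execute sequentially, so you do not run into this problem.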

Answered 2012-05-23T19:59:35.280