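I'm trying to implement a GLSL spinlock so that I can implement single-pass depth peeling. I'm having trouble because examples of locking-texture usage are scarce. I have to admit that I don't really know what I'm doing, so to be safe I'm probably describing more context than is necessary.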

I wrote a fragment program that should effectively do nothing:

#version 420 core

//The lock texture holds either 0 or 1.
//0 means that the texture is available.
//1 means that the texture is locked.  
layout(r32ui) coherent uniform uimage2D img2D_0; //locking texture

layout(rgba32f) coherent uniform image2D img2D_1; //data texture (currently unused)

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);

    //The loop's exchange function swaps out the old value with 1.

    //If the locking texture was 0, 0 will be returned, terminating the loop;
    //the locking texture will now contain 1, indicating that the locking
    //texture is now locked.

    //Conversely, if the locking texture contains 1, then the exchange function
    //will write a 1 (so the texture is still locked), and return 1, indicating
    //that the texture is locked and unavailable.
    while (imageAtomicExchange(img2D_0,coord,1u)==1u);

    //The locking texture is locked.  More code would go here

    //This unlocks the texture.
    imageAtomicExchange(img2D_0,coord,0u);
}

The locking texture is created like this:

//data is an array initialized to all 0.
glTexImage2D(GL_TEXTURE_2D,0,GL_R32UI,size_x,size_y,0,GL_RED_INTEGER,GL_UNSIGNED_INT,data);

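To run the algorithm, I take an FBO with an RGBA F32 color render attachment and enable it. I bind the shader above, then pass the locking texture to img2D_0 and the color attachment to img2D_1, using the following code: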

glBindImageTextureEXT(
    /* 0, 1, respectively */,
    texture_id, 0,GL_FALSE,0, GL_READ_WRITE,
    /* GL_R32UI, GL_RGBA32F, respectively */
);

The object is then rendered using a VBO, and some secondary passes display the contents of the data.

The problem is that the fragment program given crashes the video driver (because it never terminates). My question is: why? The texture is initialized to 0, and I'm pretty sure my logic for the exchange functions is valid. Are my setup and methodology basically correct?

1 Answer

One issue is that if two threads in the same warp hit the same lock location, that warp will deadlock: one thread acquires the lock while the other spins, and the warp keeps executing the spinning thread, which prevents the thread holding the lock from ever making progress.

Edit

Based on your revised pastebin, I would suggest something like:

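bool done = false;
while (!done) {
    // Try to take the lock; only the thread that saw 0 enters the body.
    if ((done = (imageAtomicExchange(img2D_0, coord, 1u) == 0u))) {
        // guarded operations
        // ...
        // Release the lock (imageStore on a uimage2D takes a uvec4).
        imageStore(img2D_0, coord, uvec4(0u));
    }
}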

This avoids the warp loop deadlock as the threads left out are those that have already completed their locked modification. If only one thread can acquire its lock, that thread will make progress.

answered 2012-08-04T19:16:06.080