c# - How do I use "parallel for" instead of several "for" loops?

Question

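I'm trying to write faster code for a Sobel filter, but I can't work out how to use Parallel.For when there are several nested for loops.

Should I use as many parallel loops as there are for loops?

Would that actually help?

Can someone explain it using my code? Here it is: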

for (int y = 0; y < Image.Height; y++)
{
    for (int x = 0; x < Image.Width * 3; x += 3)
    {
        r_x = g_x = b_x = 0; //reset the gradients in x-direction values
        r_y = g_y = b_y = 0; //reset the gradients in y-direction values
        location = x + y * ImageData.Stride; //to get the location of any pixel >> location = x + y * Stride
        for (int yy = -(int)Math.Floor(weights_y.GetLength(0) / 2.0d), yyy = 0; yy <= (int)Math.Floor(weights_y.GetLength(0) / 2.0d); yy++, yyy++)
        {
            if (y + yy >= 0 && y + yy < Image.Height) //to prevent crossing the bounds of the array
            {
                for (int xx = -(int)Math.Floor(weights_x.GetLength(1) / 2.0d) * 3, xxx = 0; xx <= (int)Math.Floor(weights_x.GetLength(1) / 2.0d) * 3; xx += 3, xxx++)
                {
                    if (x + xx >= 0 && x + xx <= Image.Width * 3 - 3) //to prevent crossing the bounds of the array
                    {
                        location2 = x + xx + (yy + y) * ImageData.Stride; //to get the location of any pixel >> location = x + y * Stride

                        sbyte weight_x = weights_x[yyy, xxx];
                        sbyte weight_y = weights_y[yyy, xxx];
                        //applying the same weight to all channels
                        b_x += buffer[location2] * weight_x;
                        g_x += buffer[location2 + 1] * weight_x; //G_X
                        r_x += buffer[location2 + 2] * weight_x;
                        b_y += buffer[location2] * weight_y;
                        g_y += buffer[location2 + 1] * weight_y;//G_Y
                        r_y += buffer[location2 + 2] * weight_y;
                    }
                }
            }
        }
        //getting the magnitude for each channel
        b = (int)Math.Sqrt(Math.Pow(b_x, 2) + Math.Pow(b_y, 2));
        g = (int)Math.Sqrt(Math.Pow(g_x, 2) + Math.Pow(g_y, 2));//G
        r = (int)Math.Sqrt(Math.Pow(r_x, 2) + Math.Pow(r_y, 2));

        if (b > 255) b = 255;
        if (g > 255) g = 255;
        if (r > 255) r = 255;

        //getting grayscale value
        grayscale = (b + g + r) / 3;

        //thresholding to clean up the background
        //if (grayscale < 80) grayscale = 0;
        buffer2[location] = (byte)grayscale;
        buffer2[location + 1] = (byte)grayscale;
        buffer2[location + 2] = (byte)grayscale;
        //thresholding to clean up the background
        //if (b < 100) b = 0;
        //if (g < 100) g = 0;
        //if (r < 100) r = 0;

        //buffer2[location] = (byte)b;
        //buffer2[location + 1] = (byte)g;
        //buffer2[location + 2] = (byte)r;
    }
}

score 4 · Accepted Answer

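The most important questions are: is the work parallelizable, and does the object model you're using support concurrency? Things that are purely maths-related, and where the results aren't accumulated across iterations, tend to be parallelizable, but I can't comment on the thread safety of the object model. Thread safety is not guaranteed (and the default is usually "no").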
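
To make the "not accumulated" distinction concrete, here is a minimal sketch. The class and method names (`ParallelizableDemo`, `SquareRoots`, `Sum`) are made up for illustration and are not from the question. The first method has fully independent iterations; the second accumulates into one total, so it keeps a per-thread subtotal and merges the subtotals atomically.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo type; not part of the question's code.
public static class ParallelizableDemo
{
    // Independent work: each iteration writes only to its own slot,
    // so no synchronization is needed.
    public static double[] SquareRoots(int[] input)
    {
        var output = new double[input.Length];
        Parallel.For(0, input.Length, i =>
        {
            output[i] = Math.Sqrt(input[i]);
        });
        return output;
    }

    // Accumulating work: all iterations contribute to one total.
    // Each worker keeps a private subtotal and merges it atomically
    // when it finishes, instead of every iteration touching "total".
    public static long Sum(int[] input)
    {
        long total = 0;
        Parallel.For<long>(0, input.Length,
            () => 0L,                                          // initial per-thread subtotal
            (i, loopState, subtotal) => subtotal + input[i],   // runs without shared writes
            subtotal => Interlocked.Add(ref total, subtotal)); // one atomic merge per worker
        return total;
    }
}
```

A plain `total += input[i]` inside the parallel body would be a data race; that is the kind of accumulation that stops a loop from being trivially parallelizable.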

As for where to put it:

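There is very little point in nested parallelism. Parallelism has overheads, and multiplying those overheads is counter-productive. The most effective way to approach parallelism is to think "chunky": a relatively small number of non-trivial operations (but hopefully at least as many as you have available CPU cores), rather than a large number of trivial operations. So the most effective place to put the parallelism is usually the outermost loop. In this case that maps to rows in the image, which seems a reasonable way to partition image processing. You could instead divide the image into N bands (for N CPU cores), but to be honest, I suspect rows will work just fine and keep things simple.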
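
A minimal sketch of both partitioning options, assuming a flat single-channel image stored row by row. `RowParallelism`, `ByRow`, `ByBand` and the `Invert` per-pixel operation are hypothetical stand-ins, not the question's Sobel kernel. In both versions only the outermost loop is parallel and the inner loop stays an ordinary `for`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical demo type; the per-pixel work is a placeholder.
public static class RowParallelism
{
    private static byte Invert(byte value) => (byte)(255 - value);

    // One parallel iteration per row; x is iterated sequentially.
    public static byte[] ByRow(byte[] source, int width, int height)
    {
        var result = new byte[source.Length];
        Parallel.For(0, height, y =>
        {
            int rowStart = y * width;
            for (int x = 0; x < width; x++)
                result[rowStart + x] = Invert(source[rowStart + x]);
        });
        return result;
    }

    // Alternative: split the rows into bands, roughly one per core.
    public static byte[] ByBand(byte[] source, int width, int height)
    {
        var result = new byte[source.Length];
        if (height <= 0) return result; // Partitioner.Create needs a non-empty range

        int cores = Environment.ProcessorCount;
        int bandSize = Math.Max(1, (height + cores - 1) / cores);

        // Each band is a Tuple<int, int>: (first row inclusive, last row exclusive).
        Parallel.ForEach(Partitioner.Create(0, height, bandSize), band =>
        {
            for (int y = band.Item1; y < band.Item2; y++)
            {
                int rowStart = y * width;
                for (int x = 0; x < width; x++)
                    result[rowStart + x] = Invert(source[rowStart + x]);
            }
        });
        return result;
    }
}
```

Which of the two is faster depends on the image size and the cost per pixel, so it is worth measuring both before committing to the more complicated one.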

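However! Note that you'll need to avoid shared state: r_x, g_x and b_x (and the same for their _y counterparts, and any other shared locals) need to be declared inside the parallel portion, to ensure that each iteration has its own independent copy. Other things to look at: grayscale, location, location2, r, g, b, yyy, xxx. It would be good to see where these are currently declared, but I suspect they all need to be moved so that they are declared inside the parallel portion. Check every local that is declared, and every field that is accessed.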

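It looks like buffer and buffer2 are just input/output arrays, in which case they should work fine here: buffer is only read, and each iteration writes to its own pixel in buffer2.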
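
Putting those points together, here is a sketch of how the question's loop might look with the outer `y` loop turned into `Parallel.For` and every working variable declared inside the lambda. Several things are assumptions, because the question doesn't show them: the `SobelSketch.Apply` wrapper and its parameters, the 3x3 kernel values (the usual Sobel kernels), and a 24-bit, 3-bytes-per-pixel layout with `stride` bytes per row.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical wrapper around the question's loop; names are assumptions.
public static class SobelSketch
{
    // Assumed kernels: the question does not show weights_x / weights_y.
    private static readonly sbyte[,] weights_x = { { -1, 0, 1 }, { -2, 0, 2 }, { -1, 0, 1 } };
    private static readonly sbyte[,] weights_y = { { -1, -2, -1 }, { 0, 0, 0 }, { 1, 2, 1 } };

    public static byte[] Apply(byte[] buffer, int width, int height, int stride)
    {
        var buffer2 = new byte[buffer.Length];
        int halfY = weights_y.GetLength(0) / 2;
        int halfX = weights_x.GetLength(1) / 2;

        // Only the outermost loop is parallel: one iteration per image row.
        Parallel.For(0, height, y =>
        {
            for (int x = 0; x < width * 3; x += 3)
            {
                // Declared inside the lambda, so every iteration has its own copies.
                int r_x = 0, g_x = 0, b_x = 0;
                int r_y = 0, g_y = 0, b_y = 0;
                int location = x + y * stride;

                for (int yy = -halfY, yyy = 0; yy <= halfY; yy++, yyy++)
                {
                    if (y + yy < 0 || y + yy >= height) continue; // outside the image
                    for (int xx = -halfX * 3, xxx = 0; xx <= halfX * 3; xx += 3, xxx++)
                    {
                        if (x + xx < 0 || x + xx > width * 3 - 3) continue; // outside the row
                        int location2 = x + xx + (y + yy) * stride;
                        int weight_x = weights_x[yyy, xxx];
                        int weight_y = weights_y[yyy, xxx];
                        b_x += buffer[location2] * weight_x;
                        g_x += buffer[location2 + 1] * weight_x;
                        r_x += buffer[location2 + 2] * weight_x;
                        b_y += buffer[location2] * weight_y;
                        g_y += buffer[location2 + 1] * weight_y;
                        r_y += buffer[location2 + 2] * weight_y;
                    }
                }

                // Gradient magnitude per channel, clamped to the byte range.
                int b = Math.Min(255, (int)Math.Sqrt(b_x * b_x + b_y * b_y));
                int g = Math.Min(255, (int)Math.Sqrt(g_x * g_x + g_y * g_y));
                int r = Math.Min(255, (int)Math.Sqrt(r_x * r_x + r_y * r_y));
                byte grayscale = (byte)((b + g + r) / 3);

                // Each iteration writes only its own pixel, so buffer2 needs no locking.
                buffer2[location] = grayscale;
                buffer2[location + 1] = grayscale;
                buffer2[location + 2] = grayscale;
            }
        });
        return buffer2;
    }
}
```

The arrays passed in are assumed to be plain managed `byte[]` copies of the pixel data; nothing here touches the `Bitmap` or `BitmapData` objects from inside the parallel loop, since their thread safety is not something this sketch can vouch for.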
