c# - My dithering algorithm is super slow

Question

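So here's some background. I'm developing a game called ShiftOS, which takes place inside an operating system that starts out as a bare-bones, run-of-the-mill 80s operating system with very few features.

I'm trying to add a mechanic where the user has to start with a binary (2-color) color depth and can only display black and white on screen. They then have to upgrade the color depth from 1-bit to 2-bit to 4-bit, all the way up to 24-bit. It's a really neat mechanic, but in practice it's proving very difficult.

Of course, old systems of that era at least tried to make images look decent, but they were limited by the palettes the engineers gave them, so they had to dither images, arranging pixels so that an image appeared to use more colors when in reality it could only use 2.

So I looked up some good dithering algorithms, started learning the Floyd-Steinberg algorithm, and soon ported it to C# and System.Drawing.

Here's the code I use.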

var bmp = new Bitmap(source.Width, source.Height);
var sourceBmp = (Bitmap)source;
int error = 0;
for (int y = 0; y < bmp.Height; y++)
{
    for (int x = 0; x < bmp.Width; x++)
    {
        Color c = sourceBmp.GetPixel(x, y);
        int gray = ((c.R + c.G + c.B) / 3);
        if (gray >= 127)
        {
            error = gray - 255;
            bmp.SetPixel(x, y, Color.White);
        }
        else
        {
            error = gray;
            bmp.SetPixel(x, y, Color.Black);
        }
        /*
         * Pixel error diffusion map: Floyd-Steinberg. Thanks to Wikipedia.
         * 
         *  pixel[x + 1][y    ] := pixel[x + 1][y    ] + quant_error * 7 / 16
         *  pixel[x - 1][y + 1] := pixel[x - 1][y + 1] + quant_error * 3 / 16
         *  pixel[x    ][y + 1] := pixel[x    ][y + 1] + quant_error * 5 / 16
         *  pixel[x + 1][y + 1] := pixel[x + 1][y + 1] + quant_error * 1 / 16
         */

        if(x - 1 >= 0 && y + 1 != bmp.Height)
        {
            var bottomRightColor = sourceBmp.GetPixel(x - 1, y + 1);
            int bottomRightGray = ((bottomRightColor.R + bottomRightColor.G + bottomRightColor.B) / 3) + ((error * 3) / 16);
            if (bottomRightGray < 0)
                bottomRightGray = 0;
            if (bottomRightGray > 255)
                bottomRightGray = 255;
            sourceBmp.SetPixel(x - 1, y + 1, Color.FromArgb(bottomRightGray, bottomRightGray, bottomRightGray));
        }
        if (x + 1 != sourceBmp.Width)
        {
            var rightColor = sourceBmp.GetPixel(x + 1, y);
            int rightGray = ((rightColor.R + rightColor.G + rightColor.B) / 3) + ((error * 7) / 16);
            if (rightGray < 0)
                rightGray = 0;
            if (rightGray > 255)
                rightGray = 255;
            sourceBmp.SetPixel(x + 1, y, Color.FromArgb(rightGray, rightGray, rightGray));
        }
        if (x + 1 != sourceBmp.Width && y + 1 != sourceBmp.Height)
        {
            var bottomRightColor = sourceBmp.GetPixel(x + 1, y + 1);
            int bottomRightGray = ((bottomRightColor.R + bottomRightColor.G + bottomRightColor.B) / 3) + ((error) / 16);
            if (bottomRightGray < 0)
                bottomRightGray = 0;
            if (bottomRightGray > 255)
                bottomRightGray = 255;
            sourceBmp.SetPixel(x + 1, y + 1, Color.FromArgb(bottomRightGray, bottomRightGray, bottomRightGray));
        }
        if (y + 1 != sourceBmp.Height)
        {
            var bottomColor = sourceBmp.GetPixel(x, y + 1);
            int bottomGray = ((bottomColor.R + bottomColor.G + bottomColor.B) / 3) + ((error * 5) / 16);
            if (bottomGray < 0)
                bottomGray = 0;
            if (bottomGray > 255)
                bottomGray = 255;
            sourceBmp.SetPixel(x, y + 1, Color.FromArgb(bottomGray, bottomGray, bottomGray));
        }
    }
}

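Note that source is an Image passed to the function as a parameter.

This code works fine. The dithering runs on a separate thread to minimize slowdown/lag in the game, and while it runs, the operating system's regular 24-bit colors/images are shown. That would be fine if the dithering didn't take so long.

However, I've noticed that the algorithm in this code is extremely slow, and depending on the size of the image I'm dithering, the process can take over a minute!

I've applied every optimization I can think of, such as running things on a thread separate from the game thread and invoking an Action given to the function when the thread completes, but this shaves off only a little time, if any.

So I'm wondering whether there are any further optimizations that would make this operation faster, a few seconds in total if possible. I'd also like to point out that while the dithering operation is happening I get noticeable system lag; the mouse even jitters and jumps around at times. Not cool for those PC Master Race folks who must have 60FPS.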

score 0 · Accepted Answer

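The first thing that comes to mind is to treat the Bitmap as an array. That isn't available by default, since there's no interface for it, but you can get there with a bit of hacking. A quick search led me to this answer. You have to mark your method as unsafe, get the pixel values with LockBits, and access them with pointer arithmetic (see the original answer for the complete code):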

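// Must run inside an unsafe context (compile with /unsafe).
var rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
System.Drawing.Imaging.BitmapData bmpData =
    bmp.LockBits(rect, System.Drawing.Imaging.ImageLockMode.ReadWrite,
    bmp.PixelFormat);
var pt = (byte*)bmpData.Scan0;
// inside the for loops over y and x:
var row = pt + (y * bmpData.Stride);
var pixel = row + x * bpp; // bpp = bytes per pixel (e.g. 3 for Format24bppRgb)
// after the loops, release the lock:
bmp.UnlockBits(bmpData);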

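pixel will point to the bytes that encode the color of that pixel. As you've already seen, GetPixel and SetPixel are slow, because they actually call LockBits on every call to guarantee the operation. Working on the array removes the per-pixel read calls; SetPixel can still be a bottleneck if you need the bitmap updated as soon as possible. If you can update it all at once at the end, do that.

The second idea is to create some kind of Task queue that updates your array step by step. As far as I can see, you update the image from one corner to the other, so maybe you can set up a parallel version of the update. Maybe you can create an immutable array of the current state with versioning, so that in the end you simply combine the new versions into your bmp.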
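
To illustrate the buffer-based idea, here is a minimal sketch of Floyd-Steinberg running on a plain byte array instead of on GetPixel/SetPixel. The class name BufferDither, the method name and the one-byte-per-pixel grayscale layout are assumptions made for this example, not part of the original answer.

```csharp
using System;

// Hypothetical helper for illustration; not from the original answer.
public static class BufferDither
{
    // Dithers an 8-bit grayscale buffer (one byte per pixel, row-major)
    // to pure black (0) and white (255) using Floyd-Steinberg diffusion.
    public static byte[] FloydSteinberg(byte[] gray, int width, int height)
    {
        if (gray.Length != width * height)
            throw new ArgumentException("Buffer size must equal width * height.");

        // Work on ints so accumulated error may temporarily leave 0..255.
        var work = new int[gray.Length];
        for (int i = 0; i < gray.Length; i++) work[i] = gray[i];

        var result = new byte[gray.Length];
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                int i = y * width + x;
                int oldVal = work[i];
                int newVal = oldVal < 128 ? 0 : 255;
                result[i] = (byte)newVal;
                int error = oldVal - newVal;

                bool hasRight = x + 1 < width;
                bool hasBelow = y + 1 < height;
                if (hasRight) work[i + 1] += error * 7 / 16;                     // right
                if (hasBelow && x > 0) work[i + width - 1] += error * 3 / 16;    // bottom-left
                if (hasBelow) work[i + width] += error * 5 / 16;                 // bottom
                if (hasBelow && hasRight) work[i + width + 1] += error * 1 / 16; // bottom-right
            }
        }
        return result;
    }
}
```

In the question's setting, the buffer would presumably be filled once from the bitmap (for example with LockBits and Marshal.Copy) and written back once at the end. Keep in mind that rows of a locked bitmap are Stride bytes apart, which can be more than the width times the bytes per pixel.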

score 0 · Accepted Answer

@VMAtm's answer is probably the most important one.

            if (bottomRightGray < 0)
                bottomRightGray = 0;
            if (bottomRightGray > 255)
                bottomRightGray = 255;

could be refactored to

bottomRightGray = Clamp(bottomRightGray, 0, 255);

which, if implemented with some ASM magic, may improve performance.
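
Clamp is not defined in the snippet above, so here is a minimal sketch of what such a helper could look like. The class name PixelMath is an assumption for this example; on newer runtimes the built-in Math.Clamp does the same job. Whether this is any faster than the two if statements would have to be measured.

```csharp
// Hypothetical helper for illustration; not from the original answer.
public static class PixelMath
{
    // Restricts value to the inclusive range [min, max].
    public static int Clamp(int value, int min, int max)
    {
        if (value < min) return min;
        if (value > max) return max;
        return value;
    }
}
```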

((error * X) / 16)

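can be precomputed once in the program for each of the four values of X, because the error can only take about 256 distinct values, giving a table of 256*4 entries. (In the question's code the error actually ranges from -128 to 126 rather than 0..255, so the table index needs an offset.) This may or may not improve speed.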
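
A minimal sketch of such a table, assuming the error may be anywhere in -255..255 so that it also covers accumulated error. The names ErrorTable, Offset and Lookup are made up for this example.

```csharp
// Hypothetical lookup table for illustration; not from the original answer.
public static class ErrorTable
{
    // Added to the error so that negative values map to valid indices.
    public const int Offset = 255;

    // Floyd-Steinberg weights: right, bottom-left, bottom, bottom-right.
    private static readonly int[] Weights = { 7, 3, 5, 1 };

    // Table[w][error + Offset] == error * Weights[w] / 16
    private static readonly int[][] Table = Build();

    private static int[][] Build()
    {
        var table = new int[Weights.Length][];
        for (int w = 0; w < Weights.Length; w++)
        {
            table[w] = new int[2 * Offset + 1];
            for (int error = -Offset; error <= Offset; error++)
                table[w][error + Offset] = error * Weights[w] / 16;
        }
        return table;
    }

    // weightIndex: 0 = 7/16, 1 = 3/16, 2 = 5/16, 3 = 1/16.
    public static int Lookup(int weightIndex, int error)
    {
        return Table[weightIndex][error + Offset];
    }
}
```

A multiply and a divide by a constant are already cheap, so the table may bring no measurable gain; profiling both variants is the only way to know.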

score -2 · Accepted Answer

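You must get a buffer of pixels instead of using the Get/Set functions.
You must avoid calculations inside the loop. Reduce them by precomputing the possible cases.
When done, put the pixels back into the image.
You could use ordered dithering, because it needs fewer calculations and is much faster than Floyd-Steinberg. It gives good quality and was used back when displays had few colors.

The following code is written in C, but you can easily rewrite it in C#.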

#define f7_16   112
#define f5_16    80
#define f3_16    48
#define f1_16    16

//  Black-white Floyd-Steinberg dither
void    makeDitherFS( BYTE* pixels, int width, int height )
{
    const int   size    = width * height;

    int*    error   = (int*)malloc( size * sizeof(int) );

    //  Clear the errors buffer.
    memset( error, 0, size * sizeof(int) );

    //~~~~~~~~

    int i   = 0;

    for( int y = 0; y < height; y++ )
    {
        BYTE*   prow   = pixels + ( y * width * 3 );

        for( int x = 0; x < width; x++,i++ )
        {
            const int   blue    = prow[x * 3 + 0];
            const int   green   = prow[x * 3 + 1];
            const int   red     = prow[x * 3 + 2];

            //  Get the pixel gray value.
            int newVal  = (red+green+blue)/3 + (error[i] >> 8); //  PixelGray + error correction

            int newc    = (newVal < 128 ? 0 : 255);
            prow[x * 3 + 0] = newc; //  blue
            prow[x * 3 + 1] = newc; //  green
            prow[x * 3 + 2] = newc; //  red

            //  Correction - the new error
            const int   cerror  = newVal - newc;

            int idx = i+1;
            if( x+1 < width )
                error[idx] += (cerror * f7_16);

            idx += width - 2;
            if( x > 0 && y+1 < height )
                error[idx] += (cerror * f3_16);

            idx++;
            if( y+1 < height )
                error[idx] += (cerror * f5_16);

            idx++;
            if( x+1 < width && y+1 < height )
                error[idx] += (cerror * f1_16);
        }
    }

    free( error );
}
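
The ordered dithering mentioned in this answer can be sketched in C# as follows. This is a minimal example assuming a one-byte-per-pixel grayscale buffer and the standard 4x4 Bayer matrix; the class name OrderedDither and the method name Apply are made up for illustration.

```csharp
// Hypothetical helper for illustration; not from the original answer.
public static class OrderedDither
{
    // Standard 4x4 Bayer threshold matrix (values 0..15).
    private static readonly int[,] Bayer4 =
    {
        {  0,  8,  2, 10 },
        { 12,  4, 14,  6 },
        {  3, 11,  1,  9 },
        { 15,  7, 13,  5 },
    };

    // Converts an 8-bit grayscale buffer (row-major) to black (0) / white (255).
    public static byte[] Apply(byte[] gray, int width, int height)
    {
        var result = new byte[gray.Length];
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                int i = y * width + x;
                // Scale the matrix entry to the 0..255 range (8, 24, ..., 248).
                int threshold = Bayer4[y & 3, x & 3] * 16 + 8;
                result[i] = gray[i] > threshold ? (byte)255 : (byte)0;
            }
        }
        return result;
    }
}
```

Unlike Floyd-Steinberg, each output pixel here depends only on its own input value and position, so no error has to be carried between pixels and rows could be processed independently.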

For more dithering algorithms, you can look here
