c# - Fastest way to convert T[,] to T[][]?

Question

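So it turns out that not all arrays are created equal. Multi-dimensional arrays can have non-zero lower bounds. See, for example, the Range.Value property of the Excel PIA: object[,] rectData = myRange.Value;

I need to convert this data into a jagged array. My first attempt, below, smacks of complexity. Any suggestions for optimizing it? It needs to handle the general case where the lower bounds may not be zero.

I have this extension method: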

    public static T[][] AsJagged<T>( this T[,] rect )
    {
        int row1 = rect.GetLowerBound(0);
        int rowN = rect.GetUpperBound(0);
        int col1 = rect.GetLowerBound(1);
        int colN = rect.GetUpperBound(1);

        int height = rowN - row1 + 1;
        int width = colN - col1 + 1;
        T[][] jagged = new T[height][];

        int k = 0;
        int l;
        for ( int i = row1; i < row1 + height; i++ )
        {
            l = 0;
            T[] temp = new T[width];
            for ( int j = col1; j < col1 + width; j++ )
                temp[l++] = rect[i, j];
            jagged[k++] = temp;
        }

        return jagged;
    }

Used like this:

    public void Foo()
    {
        int[,] iRect1 = { { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 } };
        int[][] iJagged1 = iRect1.AsJagged();

        int[] lengths = { 3, 5 };
        int[] lowerBounds = { 7, 8 };
        int[,] iRect2 = (int[,])Array.CreateInstance(typeof(int), lengths, lowerBounds);
        int[][] iJagged2 = iRect2.AsJagged();

    }

Curious whether Buffer.BlockCopy() would work, or be faster?

Edit: AsJagged needs to handle reference types.

Edit: Found a bug in AsJagged(). Added int l; and added col1 + width to the inner loop.

score 7 · Accepted Answer

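A few caveats/assumptions up front:

You seem to use only int as your data type (or at least seem to be OK with using Buffer.BlockCopy, which implies that you can work with primitive types in general).
For the test data you show, I don't think there will be much difference between any reasonably sane approaches.

That being said, the following implementation (which has to be specialized for a specific primitive type, here int, because it uses fixed) is around 10 times faster than the approach using an inner loop: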

    unsafe public static int[][] AsJagged2(int[,] rect)
    {
        int row1 = rect.GetLowerBound(0);
        int rowN = rect.GetUpperBound(0);
        int col1 = rect.GetLowerBound(1);
        int colN = rect.GetUpperBound(1);

        int height = rowN - row1 + 1;
        int width = colN - col1 + 1;
        int[][] jagged = new int[height][];

        int k = 0;
        for (int i = row1; i < row1 + height; i++)
        {
            int[] temp = new int[width];

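            // Copy one full row: width elements (not rowN, which is the
            // upper bound of the row dimension).
            fixed (int *dest = temp, src = &rect[i, col1])
            {
                MoveMemory(dest, src, width * sizeof(int));
            }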

            jagged[k++] = temp;
        }

        return jagged;
    }

    [DllImport("kernel32.dll", EntryPoint = "RtlMoveMemory")]
    unsafe internal static extern void MoveMemory(void* dest, void* src, int length);

With the following "test code":

    static void Main(string[] args)
    {
        Random rand = new Random();
        int[,] data = new int[100,1000];
        for (int i = 0; i < data.GetLength(0); i++)
        {
            for (int j = 0; j < data.GetLength(1); j++)
            {
                data[i, j] = rand.Next(0, 1000);
            }
        }

        Stopwatch sw = Stopwatch.StartNew();

        for (int i = 0; i < 100; i++)
        {
            int[][] dataJagged = AsJagged(data);
        }

        Console.WriteLine("AsJagged:  " + sw.Elapsed);

        sw = Stopwatch.StartNew();

        for (int i = 0; i < 100; i++)
        {
            int[][] dataJagged2 = AsJagged2(data);
        }

        Console.WriteLine("AsJagged2: " + sw.Elapsed);
    }

where AsJagged (the first case) is your original function, I get the following output:

    AsJagged:  00:00:00.9504376
    AsJagged2: 00:00:00.0860492

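So there is indeed a faster way. However, depending on the size of the test data, the number of times you actually perform this operation, and your willingness to allow unsafe and P/Invoke code, you might not need it.

That being said, we were working with large matrices of double (say 7000x10000 elements), where it did indeed make a huge difference.

Update: About using Buffer.BlockCopy

I might be overlooking some Marshal or other trick, but I don't think using Buffer.BlockCopy is possible here. This is because it requires both the source and the destination to be an Array.

In our example, the destination is an array (e.g. int[] temp = ...), but the source is not. While we "know" that for a two-dimensional array of primitive types the layout is such that each "row" (i.e. the first dimension) is an array of the type in memory, there is no safe (as opposed to unsafe) way to get at that array without the overhead of copying it first. So we basically need a function that works on raw memory and does not care about its actual content, like MoveMemory. BTW, the internal implementation of Buffer.BlockCopy does something similar.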
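One point worth testing against the paragraph above: a rectangular int[,] does derive from System.Array, and Buffer.BlockCopy takes byte offsets rather than element indices. The following is a minimal sketch under those assumptions, not the answer's measured code. The names JaggedSketch and AsJaggedBlockCopy are made up for illustration, and the sketch only applies to primitive element types, because Buffer.BlockCopy rejects everything else (including reference types).

```csharp
using System;

public static class JaggedSketch
{
    // Hypothetical helper, not part of the answer's code.
    // Works only for primitive element types; shown here for int.
    public static int[][] AsJaggedBlockCopy(int[,] rect)
    {
        int height = rect.GetLength(0);
        int width = rect.GetLength(1);
        int rowBytes = width * sizeof(int);

        int[][] jagged = new int[height][];
        for (int r = 0; r < height; r++)
        {
            int[] row = new int[width];
            // Offsets are byte offsets from the start of the element data,
            // so the lower bounds of rect are not part of the calculation.
            // Assumes r * rowBytes fits in an int (array under 2 GB).
            Buffer.BlockCopy(rect, r * rowBytes, row, 0, rowBytes);
            jagged[r] = row;
        }
        return jagged;
    }
}
```

Whether this beats the MoveMemory version is something to measure rather than assume; it does avoid both unsafe code and P/Invoke.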

score 6 · Accepted Answer

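Your complexity is O(N*M), where N is the number of rows and M is the number of columns. That is the best you can get when copying N*M values...

Buffer.BlockCopy might be faster than your inner for loop, but I wouldn't be surprised if the compiler already knows how to handle this code properly and you gain no further speed. You should test it to make sure.

You may be able to get better performance by not copying the data at all (at the cost of slightly slower lookups). If you create an "array row" class that holds your rectangular array and a row number, and provides an indexer that accesses the correct column, you can create an array of such rows and save yourself the copying altogether.

The complexity of creating such an array of "array rows" is O(N).

Edit: An ArrayRow class, just because it bugs me...

ArrayRow could look something like this: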
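To act on the "test it" advice, a small timing helper along these lines may be enough. It is only a sketch: the names TimingSketch and Time are hypothetical, and a single Stopwatch run like this gives a rough comparison, not a rigorous benchmark.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Hypothetical helper: runs the action once as an unmeasured warm-up
    // (so JIT compilation is not counted), then times the given number
    // of iterations.
    public static TimeSpan Time(Action action, int iterations)
    {
        action(); // warm-up, not measured

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();

        return sw.Elapsed;
    }
}
```

It could be called as, for instance, TimingSketch.Time(() => data.AsJagged(), 100), once per candidate implementation on the same input.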

    class ArrayRow<T>
    {
        private T[,] _source;
        private int _row;

        public ArrayRow(T[,] rect, int row)
        {
            _source = rect;
            _row = row;
        }

        public T this[int col] { get { return _source[_row, col]; } }
    }

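Now you create an array of ArrayRows, you don't copy anything at all, and the optimizer has a good chance of optimizing sequential access to an entire row.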
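As a sketch of how the array of rows might be built for the general case in the question (non-zero lower bounds, reference types), here is one possible variant. RowView, AsRows and the Length property are illustrative names that do not come from the answer, and the zero-based column translation is an assumption about what callers would want.

```csharp
using System;

// Hypothetical variant of ArrayRow<T> that exposes zero-based columns.
public sealed class RowView<T>
{
    private readonly T[,] _source;
    private readonly int _row;      // actual row index in the source array
    private readonly int _colBase;  // lower bound of the column dimension

    public RowView(T[,] source, int row)
    {
        _source = source;
        _row = row;
        _colBase = source.GetLowerBound(1);
    }

    public int Length { get { return _source.GetLength(1); } }

    // Zero-based column index, translated to the source's own bounds.
    public T this[int col]
    {
        get { return _source[_row, _colBase + col]; }
    }
}

public static class RowViewExtensions
{
    // O(N) in the number of rows: no element is copied.
    public static RowView<T>[] AsRows<T>(this T[,] rect)
    {
        int rowBase = rect.GetLowerBound(0);
        var rows = new RowView<T>[rect.GetLength(0)];
        for (int i = 0; i < rows.Length; i++)
            rows[i] = new RowView<T>(rect, rowBase + i);
        return rows;
    }
}
```

Since each view only holds a reference to the original array, later writes to the rectangular array show through the rows, which differs from the copying AsJagged.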
