c# - 2D array with more than 65535^2 elements --> Array dimensions exceeded supported range

Question

I have a 64-bit PC with 128 GB of RAM and I am using C# and .NET 4.5. I have the following code:

double[,] m1 = new double[65535, 65535];
long l1 = m1.LongLength;

double[,] m2 = new double[65536, 65536]; // Array dimensions exceeded supported range
long l2 = m2.LongLength;

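I know about <gcAllowVeryLargeObjects enabled="true" /> and I have set it to true.

Why can't a multidimensional array have more than 4294967295 elements? I saw the following answer: https://stackoverflow.com/a/2338797/7556646.

I also checked the documentation for gcAllowVeryLargeObjects and saw the following remark.

The maximum number of elements in an array is UInt32.MaxValue (4294967295).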
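For reference, the two array sizes in the code above fall on either side of this limit. A minimal sketch of the arithmetic follows; the class and method names are illustrative only and not part of any library:

```csharp
using System;

public static class ElementCountCheck
{
    // Total element count of a square 2D array with the given side length.
    public static ulong SquareCount(ulong side)
    {
        return checked(side * side);
    }

    public static void Main()
    {
        ulong limit = UInt32.MaxValue;                   // 4294967295
        Console.WriteLine(SquareCount(65535) <= limit);  // True:  4294836225 elements
        Console.WriteLine(SquareCount(65536) <= limit);  // False: 4294967296 elements
    }
}
```

So 65536 x 65536 is exactly one element over the documented maximum.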

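I don't understand why this limit exists. Is there a workaround? Are there plans to remove this limitation in an upcoming .NET version?

I need the elements in memory because I want to compute, for example, a symmetric eigenvalue decomposition using Intel MKL.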

[DllImport("custom_mkl", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true, SetLastError = false)]
internal static extern lapack_int LAPACKE_dsyevd(
    int matrix_layout, char jobz, char uplo, lapack_int n, [In, Out] double[,] a, lapack_int lda, [In, Out] double[] w);

score 7 · Accepted Answer

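Disclaimer: this turned out longer than expected.

Why the CLR does not support large arrays

There are several reasons why the CLR does not support large arrays on the managed heap.

Some of them are technical, some of them may be "paradigmatic".

This blog post goes into some of the reasons why the limitation exists. Essentially, it was decided to limit the maximum size of (capital O) Objects because of memory fragmentation. The cost of implementing support for larger objects was weighed against the fact that few use cases require such large objects, and those that do are, in most cases, the result of a design fallacy on the programmer's part. Since everything is an Object for the CLR, this limit also applies to arrays. To enforce it, array indexers were designed with signed integers.

But once you have made sure that your program design really requires such large arrays, you will need a workaround.

The blog post mentioned above also demonstrates that you can implement a big array without going into unmanaged territory.

But as Evk pointed out in the comments, you want to pass the array as a whole to an external function via PInvoke. This means you will want the array on the unmanaged heap, or else it would have to be marshaled during the call. For arrays this large, marshaling the whole thing is a bad idea.

Workaround

So, since the managed heap is out of the question, you will need to allocate space on the unmanaged heap and use that space for your array.

Let's say you need 8 GB worth of space: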

long size = (1L << 33);
IntPtr basePointer = System.Runtime.InteropServices.Marshal.AllocHGlobal((IntPtr)size);

Great! Now you have a region in virtual memory where you can store up to 8 GB worth of data.
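Keep in mind that this region is not tracked by the garbage collector and its contents are uninitialized, so you have to release it yourself with Marshal.FreeHGlobal. As a rough sketch (the class and method names are made up for illustration, and the demo uses a small block instead of 8 GB), allocation, raw access and release could look like this:

```csharp
using System;
using System.Runtime.InteropServices;

public static class UnmanagedBlockDemo
{
    // Allocates room for `count` doubles, writes `value` into the last slot and reads it back.
    public static double RoundTripLast(long count, double value)
    {
        long bytes = checked(count * sizeof(double));
        IntPtr basePointer = Marshal.AllocHGlobal(new IntPtr(bytes));
        try
        {
            // Byte address of the last element, computed with 64-bit arithmetic.
            IntPtr last = new IntPtr(basePointer.ToInt64() + (count - 1) * sizeof(double));
            Marshal.WriteInt64(last, BitConverter.DoubleToInt64Bits(value));
            return BitConverter.Int64BitsToDouble(Marshal.ReadInt64(last));
        }
        finally
        {
            // Always give the block back, even if an exception was thrown.
            Marshal.FreeHGlobal(basePointer);
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTripLast(1024, 3.5) == 3.5);
    }
}
```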

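How do I turn this into an array?

Well, there are two approaches in C#.

The "unsafe" approach

This lets you work with pointers, and pointers can be indexed like arrays. (In vanilla C they are often one and the same.)

If you have a good idea of how to implement 2D arrays via pointers, this will be the best option for you.

Here is a pointer.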
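A minimal sketch of what this might look like, assuming the project is compiled with unsafe code enabled (the /unsafe compiler switch or AllowUnsafeBlocks). The class name, method name and the row-major layout are my own choices for illustration:

```csharp
using System;
using System.Runtime.InteropServices;

public static class UnsafePointerDemo
{
    // Fills a rows x cols row-major matrix through a raw double* and returns one element.
    public static unsafe double FillAndRead(long rows, long cols, long row, long col)
    {
        long bytes = checked(rows * cols * sizeof(double));
        IntPtr basePointer = Marshal.AllocHGlobal(new IntPtr(bytes));
        try
        {
            double* data = (double*)basePointer.ToPointer();
            for (long r = 0; r < rows; r++)
                for (long c = 0; c < cols; c++)
                    data[r * cols + c] = r * 1000.0 + c;  // pointer indexing accepts a long
            return data[row * cols + col];
        }
        finally
        {
            Marshal.FreeHGlobal(basePointer);
        }
    }

    public static void Main()
    {
        Console.WriteLine(FillAndRead(4, 5, 2, 3) == 2003.0); // True
    }
}
```

Note that nothing here checks bounds: an index outside the block reads or corrupts whatever memory happens to be there.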

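The "Marshal" approach

Here you do not need an unsafe context, but instead have to "marshal" your data from the managed heap to the unmanaged heap. You still have to understand pointer arithmetic.

The two main functions you will want to use are PtrToStructure and its reverse, StructureToPtr. With the first, you get a copy of a value type (for example a double) from a specified position on the unmanaged heap. With the second, you put a copy of a value type onto the unmanaged heap.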
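To make the two calls concrete, here is a small sketch that writes one double into an unmanaged block and reads it back. The helper names are hypothetical, and the block is deliberately small:

```csharp
using System;
using System.Runtime.InteropServices;

public static class MarshalCopyDemo
{
    // Address of element `index` in a block of doubles starting at basePointer.
    private static IntPtr ElementAddress(IntPtr basePointer, long index)
    {
        return new IntPtr(basePointer.ToInt64() + index * sizeof(double));
    }

    // Copies `value` to slot `index` of a fresh block and returns the copy read back from it.
    public static double WriteThenRead(long count, long index, double value)
    {
        if (index < 0 || index >= count) throw new ArgumentOutOfRangeException("index");
        IntPtr basePointer = Marshal.AllocHGlobal(new IntPtr(checked(count * sizeof(double))));
        try
        {
            IntPtr p = ElementAddress(basePointer, index);
            // false: the freshly allocated block holds no valid old structure to destroy.
            Marshal.StructureToPtr(value, p, false);
            return (double)Marshal.PtrToStructure(p, typeof(double));
        }
        finally
        {
            Marshal.FreeHGlobal(basePointer);
        }
    }

    public static void Main()
    {
        Console.WriteLine(WriteThenRead(16, 7, 1.25) == 1.25); // True
    }
}
```

Every access copies the value through the marshaler, so this is noticeably slower than the pointer version, but it compiles without the unsafe switch.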

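Both approaches are "unsafe" in a sense. You will need to know your pointers.

Common pitfalls include, but are not limited to:

Forgetting to check bounds rigorously
Mixing up the size of your elements
Messing up the alignment
Mixing up which kind of 2D array you want
Forgetting about padding with 2D arrays
Forgetting to free the memory
Freeing the memory and then using it anyway

You will probably want to turn your 2D array design into a 1D array design.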
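Turning a 2D design into a 1D design comes down to mapping a (row, column) pair to a single 64-bit offset. Which mapping is right depends on the layout the consuming library expects; the matrix_layout parameter in the question's P/Invoke signature is where that choice is communicated. A sketch of both common layouts, with illustrative names:

```csharp
using System;

public static class FlatIndex
{
    // Row-major: the elements of one row are adjacent in memory.
    public static long RowMajor(long row, long col, long cols)
    {
        return checked(row * cols + col);
    }

    // Column-major: the elements of one column are adjacent in memory.
    public static long ColumnMajor(long row, long col, long rows)
    {
        return checked(col * rows + row);
    }

    public static void Main()
    {
        // Last element of a 65536 x 65536 matrix: the offset no longer fits in an Int32.
        Console.WriteLine(RowMajor(65535, 65535, 65536) == 4294967295L);  // True
        Console.WriteLine(ColumnMajor(1, 2, 4) == 9L);                    // True
    }
}
```

Multiply the resulting offset by the element size to get the byte offset from the base pointer.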

In any case, you will want to wrap it all into a class with the appropriate checks and a destructor.
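As a rough idea of the lifetime part only, the following sketch owns a block of unmanaged memory and releases it either through Dispose or, as a safety net, through the finalizer. UnmanagedBuffer is a hypothetical name, and element access is deliberately left out:

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class UnmanagedBuffer : IDisposable
{
    private IntPtr _head;

    public long ByteLength { get; private set; }

    public UnmanagedBuffer(long byteLength)
    {
        if (byteLength < 0) throw new ArgumentOutOfRangeException("byteLength");
        _head = Marshal.AllocHGlobal(new IntPtr(byteLength));
        ByteLength = byteLength;
    }

    public bool IsDisposed
    {
        get { return _head == IntPtr.Zero; }
    }

    // Base address for P/Invoke calls; refuses to hand out a freed pointer.
    public IntPtr Pointer
    {
        get
        {
            if (IsDisposed) throw new ObjectDisposedException("UnmanagedBuffer");
            return _head;
        }
    }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Safety net in case Dispose was never called.
    ~UnmanagedBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_head != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_head);
            _head = IntPtr.Zero;
        }
    }
}
```

Used with a using statement, for example `using (var buffer = new UnmanagedBuffer(1024)) { ... }`, the memory is freed deterministically and calling Dispose twice is harmless.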

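Basic example for inspiration

What follows is a generic class that is "like" an array, based on the unmanaged heap.

Features include:

It has an index accessor that accepts 64-bit integers.
It restricts the types T can be to value types.
It has bounds checking and is disposable.

Note that I do not do any type checking, so if Marshal.SizeOf fails to return the correct number, we fall into one of the pits mentioned above.

Features you will have to implement yourself include:

2D accessors and 2D array arithmetic (depending on what the other library expects, often something like p = x * size + y)
An exposed pointer for PInvoke purposes (or an internal call)

So use this as inspiration only, if at all.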

using System;
using static System.Runtime.InteropServices.Marshal;

public class LongArray<T> : IDisposable where T : struct {
    private IntPtr _head;
    private Int64 _capacity;
    private UInt64 _bytes;
    private Int32 _elementSize;

    public LongArray(long capacity) {
        if(capacity < 0) throw new ArgumentException("The capacity can not be negative");
        _elementSize = SizeOf(default(T));
        _capacity = capacity;
        _bytes = (ulong)capacity * (ulong)_elementSize;

        _head = AllocHGlobal((IntPtr)_bytes);   
    }

    public T this[long index] {
        get {
            IntPtr p = _getAddress(index);

            T val = (T)System.Runtime.InteropServices.Marshal.PtrToStructure(p, typeof(T));

            return val;
        }
        set {
            IntPtr p = _getAddress(index);

            // false: the target memory may be uninitialized, so there is no valid old structure to destroy
            StructureToPtr<T>(value, p, false);
        }
    }

    protected bool disposed = false;
    public void Dispose() {
        if(!disposed) {
            FreeHGlobal((IntPtr)_head);
            disposed = true;
        }
    }

    protected IntPtr _getAddress(long index) {
        if(disposed) throw new ObjectDisposedException("Can't access the array once it has been disposed!");
        if(index < 0) throw new IndexOutOfRangeException("Negative indices are not allowed");
        if(!(index < _capacity)) throw new IndexOutOfRangeException("Index is out of bounds of this array");
        return (IntPtr)((ulong)_head + (ulong)index * (ulong)(_elementSize));
    }
}

score 1 · Accepted Answer

I used the basic example of the "Marshal" approach from this answer by MrPaulch to create the following class, called HugeMatrix<T>:

using System;
using System.Runtime.InteropServices;

public class HugeMatrix<T> : IDisposable
    where T : struct
{
    public IntPtr Pointer
    {
        get { return pointer; }
    }

    private IntPtr pointer = IntPtr.Zero;

    public int NRows
    {
        get { return Transposed ? _NColumns : _NRows; }
    }

    private int _NRows = 0;

    public int NColumns
    {
        get { return Transposed ? _NRows : _NColumns; }
    }

    private int _NColumns = 0;

    public bool Transposed
    {
        get { return _Transposed; }
        set { _Transposed = value; }
    }

    private bool _Transposed = false;

    private ulong b_element_size = 0;
    private ulong b_row_size = 0;
    private ulong b_size = 0;
    private bool disposed = false;


    public HugeMatrix()
        : this(0, 0)
    {
    }

    public HugeMatrix(int nrows, int ncols, bool transposed = false)
    {
        if (nrows < 0)
            throw new ArgumentException("The number of rows can not be negative");
        if (ncols < 0)
            throw new ArgumentException("The number of columns can not be negative");
        _NRows = transposed ? ncols : nrows;
        _NColumns = transposed ? nrows : ncols;
        _Transposed = transposed;
        b_element_size = (ulong)(Marshal.SizeOf(typeof(T)));
        b_row_size = (ulong)_NColumns * b_element_size;
        b_size = (ulong)_NRows * b_row_size;
        pointer = Marshal.AllocHGlobal((IntPtr)b_size);
        disposed = false;
    }

    public HugeMatrix(T[,] matrix, bool transposed = false)
        : this(matrix.GetLength(0), matrix.GetLength(1), transposed)
    {
        int nrows = matrix.GetLength(0);
        int ncols = matrix.GetLength(1);
        for (int i1 = 0; i1 < nrows; i1++)
            for (int i2 = 0; i2 < ncols; i2++)
                this[i1, i2] = matrix[i1, i2];
    }

    public void Dispose()
    {
        if (!disposed)
        {
            Marshal.FreeHGlobal(pointer);
            _NRows = 0;
            _NColumns = 0;
            _Transposed = false;
            b_element_size = 0;
            b_row_size = 0;
            b_size = 0;
            pointer = IntPtr.Zero;
            disposed = true;
        }
    }

    public void Transpose()
    {
        _Transposed = !_Transposed;
    }

    public T this[int i_row, int i_col]
    {
        get
        {
            IntPtr p = getAddress(i_row, i_col);
            return (T)Marshal.PtrToStructure(p, typeof(T));
        }
        set
        {
            IntPtr p = getAddress(i_row, i_col);
            // false: the target memory may be uninitialized, so there is no valid old structure to destroy
            Marshal.StructureToPtr(value, p, false);
        }
    }

    private IntPtr getAddress(int i_row, int i_col)
    {
        if (disposed)
            throw new ObjectDisposedException("Can't access the matrix once it has been disposed");
        if (i_row < 0)
            throw new IndexOutOfRangeException("Negative row indices are not allowed");
        if (i_row >= NRows)
            throw new IndexOutOfRangeException("Row index is out of bounds of this matrix");
        if (i_col < 0)
            throw new IndexOutOfRangeException("Negative column indices are not allowed");
        if (i_col >= NColumns)
            throw new IndexOutOfRangeException("Column index is out of bounds of this matrix");
        int i1 = Transposed ? i_col : i_row;
        int i2 = Transposed ? i_row : i_col;
        ulong p_row = (ulong)pointer + b_row_size * (ulong)i1;
        IntPtr p = (IntPtr)(p_row + b_element_size * (ulong)i2);
        return p;
    }
}

I can now call the Intel MKL library with huge matrices, for example:

[DllImport("custom_mkl", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true, SetLastError = false)]
internal static extern lapack_int LAPACKE_dsyevd(
    int matrix_layout, char jobz, char uplo, lapack_int n, [In, Out] IntPtr a, lapack_int lda, [In, Out] double[] w);

For the parameter IntPtr a, I pass the Pointer property of the HugeMatrix<T> class.