c# - Best way to access more than 1 000 000 numbers in C#

Question

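So the problem is: I have a file with the *.sld extension. The file contains numbers in about 94 columns and 24,500 rows and can be read as a plain text file. What is the best way to access these numbers from a program? For example, I would like all the numbers in column 15 stored as doubles. What are my options? I have tried a DataTable, but loading the whole file with File.ReadAllLines takes about 150 MB of RAM to run the program, and I have to take into account that the program will use more than one such file. The *.sld file looks like this: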

0.000    96.47     2.51     1.43     2.56     2.47     5.83 -> more columns
1.030    96.47     2.52     1.39     3.14     2.43     5.60  |
2.044    96.47     2.43     1.63     2.96     2.34     5.86  \/
3.058    96.47     2.47     0.76     2.59     2.44     5.62  more rows
4.072    96.47     2.56     1.39     2.99     2.38     5.89

except with more columns and rows, as mentioned above. My solution looks like this:

//Read all lines of opened file to string array
string[] lines = System.IO.File.ReadAllLines(@OFD.FileName,Encoding.Default);
//Collapse each run of whitespace into a single space (done in a loop over i; loop not shown)
string partialLine = Regex.Replace(lines[i], @"\s+", " ");
//Split string to string array and add it to dataTable
string[] partialLineElement = partialLine.Split(new char[]{' '}, StringSplitOptions.RemoveEmptyEntries);
fileData.Rows.Add(partialLineElement);

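But I have trouble accessing a whole column of data, and what I get is an array of strings, not doubles. I need to add one column of this file to ZedGraph as a double[]. I also tried assigning this DataTable to a dataGridView like this: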

dataGridView1.DataSource = fileData;
dataGridView1.Refresh();

But how do I access a column as double[]? Any suggestions?

score 1 · Accepted Answer

But how do I access a column as double[]? Any suggestions?

You can use File.ReadLines, which does not load the whole file into memory.

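The ReadLines and ReadAllLines methods differ as follows: when you use ReadLines, you can start enumerating the collection of strings before the whole collection is returned; when you use ReadAllLines, you must wait for the whole array of strings to be returned before you can access the array. Therefore, when you are working with very large files, ReadLines can be more efficient.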
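To illustrate the lazy behaviour described above, here is a minimal sketch. The class and method names (SldPeek, CountColumns) are hypothetical, and the file is assumed to be separated by spaces or tabs. Because File.ReadLines yields lines on demand, counting the columns only requires reading up to the first non-empty line.

```csharp
using System;
using System.IO;
using System.Linq;

static class SldPeek
{
    // Hypothetical helper: counts the whitespace-separated fields
    // on the first non-empty line of the file.
    public static int CountColumns(string path)
    {
        // File.ReadLines is lazy, so enumeration stops as soon as
        // FirstOrDefault finds a matching line.
        string first = File.ReadLines(path)
                           .FirstOrDefault(line => !string.IsNullOrWhiteSpace(line));
        if (first == null) return 0; // empty file

        // Assumption: fields are separated by spaces and/or tabs.
        return first.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}
```

With File.ReadAllLines the same check would first build an array holding every line of the file.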

double[] col4 = File.ReadLines(filename)
                .Select(line => line.Split(new char[]{' '},StringSplitOptions.RemoveEmptyEntries))
                .Select(p => double.Parse(p[4],CultureInfo.InvariantCulture))
                .ToArray();

To get all columns (one double[] per row):

double[][] allCols = File.ReadLines(filename)
                    .Select(line => line.Split(new char[]{' '},StringSplitOptions.RemoveEmptyEntries))
                    .Select(p => p.Select(s => double.Parse(s, CultureInfo.InvariantCulture)).ToArray())
                    .ToArray();
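Note that allCols above is indexed as allCols[row][column]. If each column is needed as its own double[] (for example to hand one to ZedGraph), the values can be collected column by column instead. The following is only a sketch: SldColumns and ReadColumns are hypothetical names, and it assumes every row has the same number of fields, separated by spaces or tabs.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

static class SldColumns
{
    // Hypothetical helper: returns result[column][row], one double[] per column.
    public static double[][] ReadColumns(string path)
    {
        var columns = new List<List<double>>();
        foreach (string line in File.ReadLines(path))
        {
            string[] parts = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length == 0) continue; // skip blank lines

            // Add column lists as needed. Rows are assumed to have equal width;
            // a ragged file would produce columns of unequal length.
            while (columns.Count < parts.Length) columns.Add(new List<double>());

            for (int c = 0; c < parts.Length; c++)
                columns[c].Add(double.Parse(parts[c], CultureInfo.InvariantCulture));
        }
        return columns.Select(col => col.ToArray()).ToArray();
    }
}
```

Column 15 would then be ReadColumns(filename)[14], since the index is zero-based. For a file of 94 columns and 24,500 rows this holds about 2.3 million doubles, roughly 18 MB of raw values, and the strings for each line can be collected as soon as that line has been parsed.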

score 0 · Accepted Answer

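In the past I have used a StreamReader to import about 30,000 lines from a sample file, parse each line into 30 separate cells, and import them into a database. Reading and parsing took only a few seconds. You could give it a try. Just make sure you use it inside a "using" statement.

As for parsing column 15, I can't think of a better way than simply writing a function for it.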
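Such a function could look roughly like the sketch below. It is not code from the answer: SldReader and ReadColumn are hypothetical names, the column number is taken as 1-based, and the file is assumed to use spaces or tabs as separators and a dot as the decimal separator.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

static class SldReader
{
    // Hypothetical helper; columnNumber is 1-based, so 15 means the 15th column.
    public static double[] ReadColumn(string path, int columnNumber)
    {
        var values = new List<double>();

        // The using statement disposes the reader (and closes the file)
        // even if parsing throws.
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] parts = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
                if (parts.Length < columnNumber) continue; // skip blank or short lines

                values.Add(double.Parse(parts[columnNumber - 1], CultureInfo.InvariantCulture));
            }
        }
        return values.ToArray();
    }
}
```

A call such as SldReader.ReadColumn(path, 15) keeps only one line of text in memory at a time, plus the parsed values of the requested column.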
