1

I want to read a .txt file in C#, but not all of its lines at once. For example, consider a text file with 500 lines. I want a function that runs 25 times and reads 20 consecutive lines each time: the first call reads lines 1 to 20, the second call reads lines 21 to 40, and so on.

The simple code below does this in C++, but I don't know how to achieve it in C#:

string readLines(ifstream& in)
{
     string totalLine = "", line = "";
     // Read up to 20 lines from the stream and concatenate them.
     for(int i = 0; i < 20; i++){
          getline(in, line);

          totalLine += line;
     }
     return totalLine;
}

int main()
{

     // ...
     ifstream in;
     in.open(filename.c_str());
     while(true){
         string next20 = readLines(in);
         // do something with 20 lines.
     }
     // ...

}
4 Answers

3

There are several options here, but one simple approach is:

using (var reader = File.OpenText("file.txt"))
{
    for (int i = 0; i < 25; i++)
    {
        HandleLines(reader);
    }
}

...

private void HandleLines(TextReader reader)
{
    for (int i = 0; i < 20; i++)
    {
        string line = reader.ReadLine();
        if (line != null) // Handle the file ending early
        {
            // Process the line
        }
    }
}
answered 2013-08-14T14:04:50.230
2

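If you want to call ReadLine() as few times as possible and keep memory usage to a minimum, you can first index the lines in the file:

  1. Parse the file once and record the position at which each line starts in the FileStream.
  2. Call ReadLine() only at the desired positions.

For example: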

// Parse the file
var indexes = new List<long>();
using (var fs = File.OpenRead("text.txt"))
{
    indexes.Add(fs.Position);
    int chr;
    while ((chr = fs.ReadByte()) != -1)
    {
        if (chr == '\n')
        {                        
            indexes.Add(fs.Position);
        }
    }
}

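int minLine = 21; // first line to read (1-based)
int maxLine = 40; // last line to read (1-based)

// Read the lines
using (var fs = File.OpenRead("text.txt"))
{
    // indexes is 0-based, so line N starts at indexes[N - 1]
    fs.Position = indexes[minLine - 1];
    // Use a single reader: disposing a StreamReader also closes fs
    using (var sr = new StreamReader(fs))
    {
        for (int i = minLine; i <= maxLine; i++)
        {
            Console.WriteLine(sr.ReadLine());
        }
    }
}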

Cheers!

answered 2013-08-14T15:00:41.283
1

You can write a Batch() method like this:

public static IEnumerable<string> Batch(IEnumerable<string> input, int batchSize)
{
    int n = 0;
    var block = new StringBuilder();

    foreach (var line in input)
    {
        block.AppendLine(line);

        if (++n != batchSize)
            continue;

        yield return block.ToString();
        block.Clear();
        n = 0;
    }

    if (n != 0)
        yield return block.ToString();
}

And call it like this:

string filename = "<Your filename goes here>";
var batches = Batch(File.ReadLines(filename), 20);

foreach (var block in batches)
{
    Console.Write(block);
    Console.WriteLine("------------------------");
}
answered 2013-08-14T14:46:28.063
0

Oops. GroupBy is not lazily evaluated, so this greedily consumes the whole file before the first group is returned:

var twentyLineGroups = 
    File.ReadLines(somePath)
        .Select((line, index) => new {line, index})
        .GroupBy(x => x.index / 20)
        .Select(g => g.Select(x => x.line));

foreach(IEnumerable<string> twentyLineGroup in twentyLineGroups)
{
    foreach(string line in twentyLineGroup)
    {
        //tada!
    }
}
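
If the eager behaviour of GroupBy is a concern, one possible alternative is Enumerable.Chunk, which buffers only one batch at a time. The following is a minimal sketch, assuming .NET 6 or later (Chunk is not available on older runtimes); the class name ChunkExample, the helper ReadInBatches, and the temporary sample file are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class ChunkExample
{
    // Lazily yields the lines of a file in arrays of at most batchSize lines.
    // Assumption: the target runtime is .NET 6 or later, where Enumerable.Chunk exists.
    public static IEnumerable<string[]> ReadInBatches(string path, int batchSize)
    {
        return File.ReadLines(path).Chunk(batchSize);
    }

    static void Main()
    {
        // Hypothetical sample file with 45 numbered lines.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, Enumerable.Range(1, 45).Select(n => "line " + n));

        foreach (string[] batch in ReadInBatches(path, 20))
        {
            // The last batch may hold fewer than 20 lines.
            Console.WriteLine(batch.Length + " lines, starting with: " + batch[0]);
        }

        File.Delete(path);
    }
}
```

With the 45-line sample file, the loop sees two batches of 20 lines and a final batch of 5, and File.ReadLines only reads as far as the batch currently being built.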

answered 2013-08-14T14:07:34.937