c# - Determine the number of lines within a text file

Question

Is there an easy way to programmatically determine the number of lines within a text file?

score 425 · Accepted Answer

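Seriously belated edit: if you're using .NET 4.0 or later

The File class has a new ReadLines method which lazily enumerates lines, rather than greedily reading them all into an array like ReadAllLines does. So now you can have both efficiency and conciseness with: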

var lineCount = File.ReadLines(@"C:\file.txt").Count();
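
For readers who want to try the one-liner in isolation, here is a minimal sketch. Count() and LongCount() are LINQ extension methods, so the snippet needs using System.Linq. The class name LazyLineCount and its Count method are made up for illustration, and LongCount is used only as a precaution for very large files.

```csharp
using System.IO;
using System.Linq;

public static class LazyLineCount
{
    // Counts lines by streaming the file; ReadLines yields one line at a time
    // instead of materializing the whole file as an array.
    public static long Count(string path)
    {
        // LongCount avoids overflow if the file has more than int.MaxValue lines.
        return File.ReadLines(path).LongCount();
    }
}
```

A call such as LazyLineCount.Count(@"C:\file.txt") should return the same number as the expression above, just as a long.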

Original answer

If you're not too bothered about efficiency, you can simply write:

var lineCount = File.ReadAllLines(@"C:\file.txt").Length;

For a more efficient method you could do:

var lineCount = 0;
using (var reader = File.OpenText(@"C:\file.txt"))
{
    while (reader.ReadLine() != null)
    {
        lineCount++;
    }
}

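Edit: in response to questions about efficiency

The reason I said the second was more efficient was regarding memory usage, not necessarily speed. The first one loads the entire contents of the file into an array, which means it must allocate at least as much memory as the size of the file. The second merely loops one line at a time, so it never has to allocate more than one line's worth of memory at a time. This isn't that important for small files, but for larger files it could be an issue (for example, if you try to find the number of lines in a 4GB file on a 32-bit system, there simply isn't enough user-mode address space to allocate an array that large).

In terms of speed, I wouldn't expect there to be much in it. ReadAllLines may have some internal optimisations, but on the other hand it may have to allocate a massive chunk of memory. I'd guess that ReadAllLines might be faster for small files but significantly slower for large files, though the only way to tell would be to measure it with a Stopwatch or a code profiler.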
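
As a rough illustration of the Stopwatch suggestion, the sketch below times the two approaches from this answer. The class LineCountTiming and its method names are hypothetical, and a single run like this is only indicative: results depend on file size, disk caching and JIT warm-up, so repeated runs on realistic files would be needed before drawing conclusions.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class LineCountTiming
{
    // Loads every line into an array, then takes its length.
    public static int CountWithReadAllLines(string path)
    {
        return File.ReadAllLines(path).Length;
    }

    // Reads one line at a time and only keeps a counter.
    public static int CountWithReader(string path)
    {
        var lineCount = 0;
        using (var reader = File.OpenText(path))
        {
            while (reader.ReadLine() != null) lineCount++;
        }
        return lineCount;
    }

    // Runs the given counter once and returns the elapsed wall-clock time.
    public static TimeSpan Time(Func<string, int> counter, string path, out int lines)
    {
        var stopwatch = Stopwatch.StartNew();
        lines = counter(path);
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

For example, LineCountTiming.Time(LineCountTiming.CountWithReader, path, out var n) returns the elapsed time and stores the line count in n.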

score 13 · Accepted Answer

The easiest:

int lines = File.ReadAllLines("myfile").Length;

answered 2008-09-23T07:28:07.007

score 8 · Accepted Answer

This would use less memory, but probably take longer

int count = 0;
// using disposes the reader even if ReadLine throws
using (TextReader reader = new StreamReader("file.txt"))
{
  while (reader.ReadLine() != null)
  {
    count++;
  }
}

score 5 · Accepted Answer

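If by easy you mean lines of code that are easy to decipher but possibly inefficient?

string[] lines = System.IO.File.ReadAllLines(filename);
int cnt = lines.Length;

That's probably the quickest way to know how many lines there are.

You could also do this (depending on whether you are buffering the file in):

// for large files (pseudocode: the read loop is left elided)
while (...reads into buffer) {
    string[] lines = Regex.Split(buffer, System.Environment.NewLine);
}

There are numerous other ways, but one of the above is probably what you'll go with.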
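
The buffered idea above can be sketched more concretely as follows. This is not the answerer's code: instead of Regex.Split, it simply counts '\n' characters in fixed-size chunks, which sidesteps the problem of a line break being split across two buffers. The class name ChunkedLineCount and the 64 KB buffer size are arbitrary choices, and the sketch assumes lines end in "\n" or "\r\n" (a lone "\r" is not treated as a line break).

```csharp
using System.IO;

public static class ChunkedLineCount
{
    public static long Count(string path)
    {
        long count = 0;
        var buffer = new char[64 * 1024];
        var lastChar = '\n'; // so that an empty file reports zero lines
        using (var reader = new StreamReader(path))
        {
            int read;
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Count line feeds in the part of the buffer that was filled.
                for (var i = 0; i < read; i++)
                {
                    if (buffer[i] == '\n') count++;
                }
                lastChar = buffer[read - 1];
            }
        }
        // A final line without a trailing newline still counts as a line.
        if (lastChar != '\n') count++;
        return count;
    }
}
```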

score 5 · Accepted Answer

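Reading a file takes some time in and of itself, and garbage collecting the result is another problem, since you read the whole file just to count the newline characters.

At some point, someone has to read the characters in the file, regardless of whether that is the framework or your own code. This means you have to open the file and read it into memory. If the file is large, this can become a problem, because that memory has to be garbage collected.

Nima Ara made a nice analysis that you might take into consideration.

Here is the proposed solution. It processes 4 characters per loop iteration, counts the line feed characters, and reuses the same memory for the next character comparison.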

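private const char CR = '\r';
private const char LF = '\n';
private const char NULL = (char)0;

public static long CountLinesMaybe(Stream stream)
{
    // Plain null check in place of the external Ensure.NotNull helper (requires using System;)
    if (stream == null) throw new ArgumentNullException(nameof(stream));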

    var lineCount = 0L;

    var byteBuffer = new byte[1024 * 1024];
    const int BytesAtTheTime = 4;
    var detectedEOL = NULL;
    var currentChar = NULL;

    int bytesRead;
    while ((bytesRead = stream.Read(byteBuffer, 0, byteBuffer.Length)) > 0)
    {
        var i = 0;
        for (; i <= bytesRead - BytesAtTheTime; i += BytesAtTheTime)
        {
            currentChar = (char)byteBuffer[i];

            if (detectedEOL != NULL)
            {
                if (currentChar == detectedEOL) { lineCount++; }

                currentChar = (char)byteBuffer[i + 1];
                if (currentChar == detectedEOL) { lineCount++; }

                currentChar = (char)byteBuffer[i + 2];
                if (currentChar == detectedEOL) { lineCount++; }

                currentChar = (char)byteBuffer[i + 3];
                if (currentChar == detectedEOL) { lineCount++; }
            }
            else
            {
                if (currentChar == LF || currentChar == CR)
                {
                    detectedEOL = currentChar;
                    lineCount++;
                }
                i -= BytesAtTheTime - 1;
            }
        }

        for (; i < bytesRead; i++)
        {
            currentChar = (char)byteBuffer[i];

            if (detectedEOL != NULL)
            {
                if (currentChar == detectedEOL) { lineCount++; }
            }
            else
            {
                if (currentChar == LF || currentChar == CR)
                {
                    detectedEOL = currentChar;
                    lineCount++;
                }
            }
        }
    }

    if (currentChar != LF && currentChar != CR && currentChar != NULL)
    {
        lineCount++;
    }
    return lineCount;
}

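Above you can see that the underlying framework also reads a line one character at a time, because you need to read every character to find the line feed.

If you profile it as Nima did, you will see that this is a rather fast and efficient way of doing it.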

score 2 · Accepted Answer

You could quickly read it in and increment a counter: just use a loop to increment, doing nothing with the text.

score 1 · Accepted Answer

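Count the carriage returns / line feeds. I believe in Unicode they are still 0x000D and 0x000A respectively. That way you can be as efficient or as inefficient as you want, and decide whether you have to deal with both characters or not.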
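
One way this could look in practice is sketched below, under the assumption that "\r\n", a lone "\n" and a lone "\r" should each count as one line terminator. The class NewlineCount and its method are illustrative names, not part of any library.

```csharp
using System.IO;

public static class NewlineCount
{
    // Counts line terminators: "\r\n", lone "\n" and lone "\r" each count once.
    public static long CountTerminators(TextReader reader)
    {
        long count = 0;
        var previousWasCr = false;
        int c;
        while ((c = reader.Read()) != -1)
        {
            if (c == '\r')
            {
                count++;
                previousWasCr = true;
            }
            else
            {
                // A '\n' directly after '\r' belongs to the same "\r\n" pair.
                if (c == '\n' && !previousWasCr) count++;
                previousWasCr = false;
            }
        }
        return count;
    }
}
```

Note that this counts terminators, so a last line without a trailing line break is not included. Passing a StreamReader opened on the file works, as does a StringReader for in-memory text.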

score 1 · Accepted Answer

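A viable option, and one that I have personally used, is to add your own header to the first line of the file. I did this for a custom model format for my game. Basically, I have a tool that optimizes my .obj files, getting rid of the crap I don't need, converts them to a better layout, and then writes the total number of lines, faces, normals, vertices, and texture UVs on the very first line. That data is then used by various array buffers when the model is loaded.

This is also useful because you only need to loop through the file once to load it, instead of once to count the lines and again to read the data into your created buffers.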
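
A stripped-down sketch of the header idea is shown below. It is not the model format described above: it assumes a hypothetical text file whose first line holds only the number of records that follow, and the class name CountHeaderFile is invented for the example.

```csharp
using System.IO;

public static class CountHeaderFile
{
    // Writes the number of records on the first line, followed by the records.
    public static void Write(string path, string[] records)
    {
        using (var writer = new StreamWriter(path))
        {
            writer.WriteLine(records.Length);
            foreach (var record in records) writer.WriteLine(record);
        }
    }

    // Reads the header, allocates the array once, then fills it in a single pass.
    public static string[] Read(string path)
    {
        using (var reader = new StreamReader(path))
        {
            var count = int.Parse(reader.ReadLine());
            var records = new string[count];
            for (var i = 0; i < count; i++) records[i] = reader.ReadLine();
            return records;
        }
    }
}
```

The trade-off is that the header must be kept in sync whenever the file is modified, so this mainly suits files produced by your own tooling.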

score 0 · Accepted Answer

Use this:

    int get_lines(string file)
    {
        var lineCount = 0;
        using (var stream = new StreamReader(file))
        {
            while (stream.ReadLine() != null)
            {
                lineCount++;
            }
        }
        return lineCount;
    }

score -1 · Accepted Answer

try {
    string path = args[0];
    int count = 0;

    // Count '\n' bytes while reading, instead of first building one huge string.
    using (FileStream fh = new FileStream(path, FileMode.Open, FileAccess.Read)) {
        int i;
        while ((i = fh.ReadByte()) != -1) {
            if (i == '\n')
                count++;
        }
    }

    Console.WriteLine("The total number of line feeds was: " + count);

} catch (Exception ex) {
    Console.WriteLine(ex.Message);
}

score -3 · Accepted Answer

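You can launch the "wc.exe" executable (it comes with UnixUtils and does not need installation) and run it as an external process. It supports different line count methods (like unix vs mac vs windows).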
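
A hedged sketch of how such an external call might look with System.Diagnostics.Process follows. It assumes an executable named wc is available on the PATH and accepts the common -l option, printing the count followed by the file name; the class WcLineCount and its methods are made up for the example, and no error handling for a missing executable or a non-zero exit code is included.

```csharp
using System.Diagnostics;

public static class WcLineCount
{
    // Parses output such as "  42 file.txt" and returns the leading number.
    public static long ParseWcOutput(string output)
    {
        var firstToken = output.Trim().Split(' ', '\t')[0];
        return long.Parse(firstToken);
    }

    // Assumption: "wc" is on PATH and "wc -l <file>" prints "<count> <file>".
    public static long Count(string path)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "wc",
            Arguments = "-l \"" + path + "\"",
            RedirectStandardOutput = true,
            UseShellExecute = false, // required for output redirection
            CreateNoWindow = true
        };
        using (var process = Process.Start(startInfo))
        {
            var output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return ParseWcOutput(output);
        }
    }
}
```

Starting a process per file is relatively expensive, so this is more of a convenience than a performance option.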
