c# - Reading line by line

Question

I have a program that generates plain text files. The structure (layout) is always the same. Example:

Text file:

LinkLabel
"Hello, this text will appear in a LinkLabel once it has been
added to the form. This text may not always cover more than one line. But will always be surrounded by quotation marks."
240, 780

So, to explain what is going on in that file:

Control
Text
Location

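When a button on the form is clicked and the user opens one of these files from an OpenFileDialog, I need to be able to read each line. Starting from the top, I want to check which control it is. Then, starting on the second line, I need to get all the text inside the quotation marks (whether it is one line of text or more), and on the next line (after the closing quotation mark) I need to extract the location (240, 780)... I have thought of a few ways of doing this, but when I write them down and put them into practice they don't make much sense, and I only end up finding ways that won't work.

Has anyone done this before? Could anyone offer any help, suggestions, or advice on how I would go about it?

I have looked at CSV files, but that seems too complicated for something that looks so simple.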

Thanks, Jase

score 2 · Accepted Answer

I would try writing down the algorithm first, which is how I solve these problems (in comments):

// while not at end of file
  // read control
  // read line of text
  // while last char in line is not "
    // read line of text
  // read location

Try writing code that does what each comment says and you should be able to figure it out.

HTH.
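One way the commented outline above could be filled in is sketched below. This is only an illustration, not the answerer's code: the `ControlEntry` record and the `ControlFileReader` class are hypothetical names, and the sketch assumes blank lines may separate records and that the quoted text itself contains no quotation marks.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical record type; the names are illustrative only.
public sealed record ControlEntry(string Control, string Text, string Location);

public static class ControlFileReader
{
    public static List<ControlEntry> ReadAll(TextReader reader)
    {
        var entries = new List<ControlEntry>();
        string control;
        // while not at end of file: read control
        while ((control = reader.ReadLine()) != null)
        {
            if (control.Trim().Length == 0) continue; // assumption: skip blank lines between records

            // read line of text
            string text = reader.ReadLine() ?? throw new InvalidDataException("Missing text line.");

            // while last char in line is not ": read line of text
            // (a lone opening quote on its own line also counts as "not finished")
            while (text.TrimEnd().Length < 2 || !text.TrimEnd().EndsWith("\""))
            {
                string next = reader.ReadLine() ?? throw new InvalidDataException("Unterminated quoted text.");
                text += "\n" + next;
            }

            // read location
            string location = reader.ReadLine() ?? throw new InvalidDataException("Missing location line.");

            // Trim('"') removes the surrounding quotation marks.
            entries.Add(new ControlEntry(control.Trim(), text.Trim().Trim('"'), location.Trim()));
        }
        return entries;
    }
}
```

It could be called with `ControlFileReader.ReadAll(new StreamReader(path))`, where `path` would come from the OpenFileDialog. The location is left as a string here; splitting it into numbers is a separate step.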

score 2 · Accepted Answer

You can use a regular expression to get the lines from the text:

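// Requires: using System; using System.IO; using System.Text.RegularExpressions;
// Groups: 1 = control, 2 = quoted text (may span lines), 3 = x, 4 = y.
// \r?\n accepts both Windows and Unix line endings; the last line may lack a newline.
MatchCollection lines = Regex.Matches(File.ReadAllText(fileName), @"(.+?)\r?\n""([^""]+)""\r?\n(\d+), (\d+)(?:\r?\n|$)");
foreach (Match match in lines) {
   string control = match.Groups[1].Value;
   string text = match.Groups[2].Value;
   int x = Int32.Parse(match.Groups[3].Value);
   int y = Int32.Parse(match.Groups[4].Value);
   Console.WriteLine("{0}, \"{1}\", {2}, {3}", control, text, x, y);
}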

score 2 · Accepted Answer

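You are trying to implement a parser, and the best strategy is to split the problem into smaller pieces. You need a class such as TextReader that lets you read lines.

You should split your ReadControl method into three methods: ReadControlType, ReadText, and ReadLocation. Each method is responsible only for reading the item it is supposed to read, and for leaving the TextReader at a position where the next method can pick up. Something like this.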

public Control ReadControl(TextReader reader)
{
    string controlType = ReadControlType(reader);
    string text = ReadText(reader);
    Point location = ReadLocation(reader);
    ... return the control ...
}

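Of course, ReadText is the most interesting one, because it spans multiple lines. In fact, it is a loop that calls TextReader.ReadLine until a line ends with a quotation mark: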

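private string ReadText(TextReader reader)
{
    string line = reader.ReadLine();
    if (line == null) throw new EndOfStreamException("Expected quoted text.");
    string text = line.Substring(1); // Strip first quotation mark.
    while (!text.EndsWith("\"")) {
        line = reader.ReadLine();
        if (line == null) throw new EndOfStreamException("Unterminated quoted text.");
        text += Environment.NewLine + line; // Keep the line break so words do not run together.
    }
    return text.Substring(0, text.Length - 1); // Strip last quotation mark.
}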
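The other two helpers are not shown in the answer. A minimal sketch of what they might look like follows, assuming the control type is always a single line and the location is always written as `x, y`. The class name `ControlFieldReaders` and the error messages are made up for illustration.

```csharp
using System;
using System.Drawing;
using System.Globalization;
using System.IO;

public static class ControlFieldReaders
{
    // Reads the single line naming the control, e.g. "LinkLabel".
    public static string ReadControlType(TextReader reader)
    {
        string line = reader.ReadLine();
        if (line == null) throw new EndOfStreamException("Expected a control type.");
        return line.Trim();
    }

    // Reads a line such as "240, 780" and converts it to a Point.
    public static Point ReadLocation(TextReader reader)
    {
        string line = reader.ReadLine();
        if (line == null) throw new EndOfStreamException("Expected a location.");

        string[] parts = line.Split(',');
        if (parts.Length != 2) throw new FormatException("Expected \"x, y\" but got: " + line);

        int x = int.Parse(parts[0].Trim(), CultureInfo.InvariantCulture);
        int y = int.Parse(parts[1].Trim(), CultureInfo.InvariantCulture);
        return new Point(x, y);
    }
}
```

Because each helper consumes exactly the lines that belong to its field, they can be called one after another on the same reader, as ReadControl does.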

score 1 · Accepted Answer

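This kind of thing is annoying: it is conceptually simple, but you can end up with gnarly code. You have a relatively simple case, with one record per file. It gets harder if you have many records and you want to handle malformed records gracefully (consider doing this for something like C#).

For large-scale problems, you would probably use a grammar-driven parser, such as: link text

Much of your complexity comes from the lack of regularity in the file. The first field is terminated by a newline, the second is delimited by quotation marks, the third is terminated by a comma...

My first suggestion is to adjust the format of the file so that it is very easy to parse. You write the file, so you are in control. For example, do not allow newlines in the text, and put each item on its own line. Then you just read four lines and you are done.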
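To illustrate how little code the simplified format would need, here is a rough sketch. The four-line layout (control, text, x, y, one per line) is an assumption about what the adjusted file might look like, and `SimpleFormat.ReadRecord` is a hypothetical name.

```csharp
using System;
using System.IO;

public static class SimpleFormat
{
    // Assumed layout (not from the original post): control, text, x, y, one item per line.
    public static (string Control, string Text, int X, int Y) ReadRecord(TextReader reader)
    {
        string control = reader.ReadLine();
        string text = reader.ReadLine();
        string x = reader.ReadLine();
        string y = reader.ReadLine();

        if (control == null || text == null || x == null || y == null)
            throw new EndOfStreamException("A record needs four lines.");

        return (control, text, int.Parse(x), int.Parse(y));
    }
}
```

The trade-off is that the writer then has to encode any line breaks in the text in some agreed way, since a raw newline would break the four-line rule.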
