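I am writing a parser for a structured text file that contains the UN/EDIFACT code lists. To do this, I wrote a generic state machine in C# that I want to use to build the parser.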
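One problem I have run into: how do I correctly match a sequence of exactly 70 dashes? Do I need to increment some kind of counter each time a dash is encountered and then take the necessary action once the count is reached? I have not been able to find anything that explains how to do this.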
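To make the counter idea concrete, here is a rough sketch of what I have in mind. It is not taken from my actual state machine: the class name `SeparatorDetector`, the `Feed` method, and the rule that a separator is a whole line made of exactly 70 dashes are all assumptions for illustration.

```csharp
using System;

// Hypothetical helper (name and API are assumptions, not part of my state machine).
// It consumes one character at a time and reports when a line that consists of
// exactly 70 dashes has just ended.
public sealed class SeparatorDetector
{
    public const int SeparatorLength = 70; // assumed length of the separator line

    private int _dashCount;          // dashes seen so far on the current line
    private bool _onlyDashes = true; // false once any other character appears

    // Returns true only on the newline that terminates a 70-dash line.
    public bool Feed(char c)
    {
        if (c == '\r')
        {
            return false; // ignore CR so CRLF files behave like LF files
        }

        if (c == '\n')
        {
            bool isSeparator = _onlyDashes && _dashCount == SeparatorLength;
            _dashCount = 0;      // reset the state for the next line
            _onlyDashes = true;
            return isSeparator;
        }

        if (c == '-' && _onlyDashes)
        {
            _dashCount++;
        }
        else
        {
            _onlyDashes = false; // any other character disqualifies the line
        }
        return false;
    }
}
```

The caller would feed every character of the file to `Feed` and switch to the "code entry" state whenever it returns true. Checking the count only at the end of the line is a deliberate choice in this sketch, so that runs of 69 or 71 dashes are rejected; a final line without a trailing newline would not be detected.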
Here is an excerpt of the text file I want to parse:
PART 5 UNITED NATIONS DIRECTORIES FOR ELECTRONIC DATA INTERCHANGE
FOR ADMINISTRATION, COMMERCE AND TRANSPORT
CHAPTER 6 Code list
1. Code list UNCL
Change indicators
a plus sign (+) for an addition
an asterisk (*) for an addition/subtraction/change to an entry
for a particular data element
a hash sign (#) for changes to names
a vertical bar (|) for changes to text for descriptions,
notes and functions
a letter X (X) for marked for deletion
Usage indicators
[B] = used in batch messages only
[I] = used in interactive messages only
[C] = common usage in both batch and interactive messages
----------------------------------------------------------------------
* 1001 Document name code [C]
Desc: Code specifying the document name.
Repr: an..3
1 Certificate of analysis
Certificate providing the values of an analysis.
2 Certificate of conformity
Certificate certifying the conformity to predefined
definitions.