Your non-trivial regex: ^[a-zA-Z]([-\s\.a-zA-Z]*('(?!'))?[-\s\.a-zA-Z]*)*$
is best written with comments in free-spacing mode, like so:
Regex re_orig = new Regex(@"
    ^                      # Anchor to start of string.
    [a-zA-Z]               # First char must be letter.
    (                      # $1: Zero or more additional parts.
      [-\s\.a-zA-Z]*       # Zero or more valid name chars.
      (                    # $2: optional quote.
        '                  # Allow quote but only
        (?!')              # if not followed by quote.
      )?                   # End $2: optional quote.
      [-\s\.a-zA-Z]*       # Zero or more valid name chars.
    )*                     # End $1: Zero or more additional parts.
    $                      # Anchor to end of string.
    ", RegexOptions.IgnorePatternWhitespace);
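In plain English, this regex essentially says: "Match a string that begins with one letter, [a-zA-Z], followed by zero or more letters, whitespace characters, periods, hyphens or single quotes, where no single quote may be immediately followed by another single quote."
Note that your regex above allows oddball names such as: "ABC---...'... -.-.XYZ "
which may or may not be what you need. Because \s also matches newlines, it allows multi-line input and strings that end with whitespace as well.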
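To make the accepted and rejected inputs concrete, here is a minimal sketch that runs the original pattern against a few short strings. The class name OriginalPatternDemo, the helper Matches and the one-second match timeout are illustrative assumptions, not part of the original code; the timeout is only a safety net and is not needed for inputs this short.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OriginalPatternDemo
{
    // The original pattern from the question, unchanged.
    private const string ORIGINAL =
        @"^[a-zA-Z]([-\s\.a-zA-Z]*('(?!'))?[-\s\.a-zA-Z]*)*$";

    // Hypothetical helper: the timeout is an assumed safety net.
    public static bool Matches(string input)
    {
        return Regex.IsMatch(input, ORIGINAL, RegexOptions.None,
                             TimeSpan.FromSeconds(1));
    }

    public static void Main()
    {
        Console.WriteLine(Matches("O'Neil"));                  // True
        Console.WriteLine(Matches("ABC---...'... -.-.XYZ "));  // True: oddball name
        Console.WriteLine(Matches("Ann "));                    // True: trailing space
        Console.WriteLine(Matches("Ann\nLee"));                // True: \s matches newline
        Console.WriteLine(Matches("O''Neil"));                 // False: two quotes in a row
        Console.WriteLine(Matches("1abc"));                    // False: must start with a letter
    }
}
```

If trailing whitespace or embedded newlines should be rejected, the character class would have to be tightened (for example, a literal space instead of \s); whether that is desirable depends on your requirements.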
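The "infinite loop" problem with the above regex is catastrophic backtracking, which occurs when it is applied to a long invalid input containing two single quotes in a row. The character class [-\s\.a-zA-Z]* appears twice inside a group that is itself repeated with *, so the engine can split a run of name characters between them in exponentially many ways, and it tries them all before giving up. Here is an equivalent pattern which matches (and fails to match) exactly the same strings, but does not suffer from catastrophic backtracking: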
Regex re_fixed = new Regex(@"
    ^                      # Anchor to start of string.
    [a-zA-Z]               # First char must be letter.
    [-\s.a-zA-Z]*          # Zero or more valid name chars.
    (?:                    # Zero or more isolated single quotes.
      '                    # Allow single quote but only
      (?!')                # if not followed by single quote.
      [-\s.a-zA-Z]*        # Zero or more valid name chars.
    )*                     # Zero or more isolated single quotes.
    $                      # Anchor to end of string.
    ", RegexOptions.IgnorePatternWhitespace);
And here it is in short form, in the context of your code:
const string PATTERN = @"^[a-zA-Z][-\s.a-zA-Z]*(?:'(?!')[-\s.a-zA-Z]*)*$";
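As a rough usage sketch, the short-form pattern could be wrapped in a small validator like the one below. The names NameValidator and IsValidName, the one-second timeout and the choice to treat a timeout as "invalid" are all assumptions made for illustration; only the pattern itself comes from the answer above. The long test input mimics the problematic case (a long string with two single quotes in a row), which the rewritten pattern should reject quickly.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameValidator
{
    // Short-form pattern from the answer above.
    private const string PATTERN =
        @"^[a-zA-Z][-\s.a-zA-Z]*(?:'(?!')[-\s.a-zA-Z]*)*$";

    // Assumption: a one-second match timeout as a defensive measure.
    private static readonly Regex NameRegex =
        new Regex(PATTERN, RegexOptions.None, TimeSpan.FromSeconds(1));

    // Hypothetical helper: returns false for null, non-matching,
    // or timed-out input.
    public static bool IsValidName(string input)
    {
        if (input == null) return false;
        try
        {
            return NameRegex.IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Long invalid input with two single quotes in a row.
        string longInvalid =
            "A" + new string('a', 5000) + "''" + new string('a', 5000);

        Console.WriteLine(IsValidName("O'Neil"));     // True
        Console.WriteLine(IsValidName("O''Neil"));    // False
        Console.WriteLine(IsValidName(longInvalid));  // False
    }
}
```

The timeout is not required for this pattern to terminate; it is there so that a future change to the pattern cannot reintroduce an unbounded match on hostile input.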