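As Abbondanza mentioned, if you want to do this with a regex, you will need balancing groups. I should warn you that this is not a nice solution. While .NET's regex engine is one of the few that can handle cases like this, it is still not really the recommended approach. You are probably better off parsing your language manually, which makes it much easier to count nesting levels.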
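To give an idea of what "parsing manually and counting nesting levels" could look like, here is a minimal sketch. It is written in Python only to keep it short; the same loop ports directly to C#. The tag syntax is taken from the pattern in this answer, while the function name `top_level_else_bodies` and the choice to return everything between a top-level Else and its matching EndIf are assumptions made for illustration.

```python
import re

# Recognizes the three tags used in this answer. Like the regex version,
# it assumes the If condition contains no parentheses of its own.
TOKEN = re.compile(r"\[!--@(?:(If)\([^()]*\)|(Else)|(EndIf))--\]")


def top_level_else_bodies(template):
    """Return the text of every Else branch that belongs to an outermost If."""
    bodies = []
    depth = 0          # number of If blocks currently open
    else_start = None  # index where the current top-level Else body begins

    for m in TOKEN.finditer(template):
        kind = m.group(1) or m.group(2) or m.group(3)
        if kind == "If":
            depth += 1
        elif kind == "Else":
            if depth == 0:
                raise ValueError("Else outside of any If")
            if depth == 1:
                else_start = m.end()
        else:  # EndIf
            if depth == 0:
                raise ValueError("EndIf without a matching If")
            if depth == 1 and else_start is not None:
                bodies.append(template[else_start:m.start()])
                else_start = None
            depth -= 1

    if depth != 0:
        raise ValueError("unclosed If block")
    return bodies
```

For a template such as `[!--@If(a)--]A[!--@If(b)--]B[!--@Else--]C[!--@EndIf--][!--@Else--]D[!--@EndIf--]`, this returns only `D`, because the Else that precedes `C` sits two levels deep. Unlike the regex approach, nested If blocks inside the Else body need no extra work here.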
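Anyway, just to show you why regex is not suited for this task in production software, here is a regex (using `RegexOptions.IgnorePatternWhitespace` and `RegexOptions.Singleline`) that still makes some simplifying assumptions (I'll get to those later):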
    (?<=\[!--@Else--\]) # Make sure that our match begins right after an else
                        # block.
    [^\[]*              # Match as many non-[ characters as possible (the actual
                        # statement)
    (?=                 # This lookahead will assert that the previous statement
                        # was a top-level Else
      (?<Depth>)        # Push one capture onto the stack "Depth" (because, if
                        # this is one of the desired "Else"s we are exactly one
                        # level deep)
      (?>               # Start a subpattern for anything that could follow and
                        # suppress backtracking (because the alternatives are
                        # mutually exclusive)
        (?<Depth>\[!--@If\([^()]*\)--\])
                        # If we encounter an If block, push a new capture onto
                        # the stack (because the nesting level rises)
      |                 # OR
        (?<-Depth>)\[!--@EndIf--\]
                        # IF we can pop a capture from the stack, consume an
                        # EndIf. If we cannot, the named group will fail. Hence
                        # we can only consume one EndIf more than we already
                        # encountered Ifs.
      |                 # OR
        (?!\[!--@EndIf--\]).
                        # If this character does not mark the beginning of an
                        # EndIf, consume an arbitrary character.
      )*                # Repeat as long as possible.
      $                 # Make sure we have reached the end of the string.
      (?(Depth)(?!))    # If there is anything left on the stack, fail, too,
                        # because there are some Ifs that were not closed, so
                        # the syntax was invalid anyway.
                        # You can leave this out if you have convinced yourself
                        # beforehand that the overall nesting syntax is correct.
    )                   # End of lookahead.
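Now this is already quite a beast that hardly anyone would understand without this novel of comments.
But I mentioned simplifying assumptions. Here you go.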
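- I do not allow parentheses of any kind inside the `If` condition. If you want to allow them, you have to check for their correct nesting, too. That is slightly simpler than what I did here, but it still requires building a stack of parentheses up and down.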
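- The main problem is probably the actual match, `[^\[]*`. Since I do not allow any kind of opening bracket, you cannot have conditional statements inside the `Else` block. If you want to allow that, you have to copy almost the whole thing into the actual match again, so that you know which `If`s and `EndIf`s are inside the `Else` and which come after it.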
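For comparison, lifting the first restriction in hand-written code is a small counting loop. The following is a rough sketch (Python again, purely for brevity); the helper name `matching_paren` is made up for illustration, and it assumes that conditions never contain parentheses inside string literals.

```python
def matching_paren(text, open_index):
    """Return the index of the ')' that closes the '(' at open_index."""
    if text[open_index] != "(":
        raise ValueError("expected '(' at open_index")
    depth = 0
    for i in range(open_index, len(text)):
        if text[i] == "(":
            depth += 1      # one level deeper
        elif text[i] == ")":
            depth -= 1      # one level back up
            if depth == 0:
                return i    # this ')' closes the one we started at
    raise ValueError("unbalanced parentheses")


tag = "[!--@If((a || b) && f(c))--]"
start = tag.index("(")
condition = tag[start + 1:matching_paren(tag, start)]
```

Here `condition` ends up holding the full expression between the outermost parentheses, nested calls included, which is exactly what `[^()]*` in the regex cannot match.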
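You see, to get a regex solution that covers 100% of all cases, you would have to make that code completely unmaintainable. That is why you should really consider analysing the string manually and building some kind of syntax tree. This way you get an OOP representation of your nesting structure that can easily be traversed for the specific `Else`s you want to find.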
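As a rough illustration of the syntax-tree idea, here is a minimal sketch under stated assumptions. It is in Python for compactness, and a C# version would follow the same shape. The class `IfNode` and the function `parse` are hypothetical names, and the tag syntax is the one from the pattern above, including its restriction that conditions contain no parentheses.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

TOKEN = re.compile(r"\[!--@(?:If\(([^()]*)\)|(Else)|(EndIf))--\]")


@dataclass
class IfNode:
    condition: str
    then_branch: List = field(default_factory=list)  # items: str or IfNode
    else_branch: Optional[List] = None               # None if there is no Else


def parse(template):
    """Build a list of plain-text chunks and IfNode objects from the template."""
    root = []
    stack = []       # (open IfNode, list we were filling before it opened)
    current = root   # list that receives the next chunk or node
    pos = 0
    for m in TOKEN.finditer(template):
        if m.start() > pos:
            current.append(template[pos:m.start()])
        pos = m.end()
        if m.group(1) is not None:            # If(condition)
            node = IfNode(m.group(1))
            current.append(node)
            stack.append((node, current))
            current = node.then_branch
        elif m.group(2):                      # Else
            if not stack or stack[-1][0].else_branch is not None:
                raise ValueError("unexpected Else")
            stack[-1][0].else_branch = []
            current = stack[-1][0].else_branch
        else:                                 # EndIf
            if not stack:
                raise ValueError("EndIf without a matching If")
            _, current = stack.pop()
    if stack:
        raise ValueError("unclosed If block")
    if pos < len(template):
        current.append(template[pos:])
    return root
```

With the tree in hand, the top-level Else branches are simply `[n.else_branch for n in parse(template) if isinstance(n, IfNode) and n.else_branch is not None]`, and deeper levels can be reached by recursing into `then_branch` and `else_branch`.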