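I'm not aware of any regex engine that can return all valid matches.
But we can apply some logic to generate all candidate strings and present each one to the regex.
The candidates are constructed by enumerating every possible substring of the given input.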
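    var str = "y z a a a b c c z";
    var regex = new Regex("(a )+(b )+(c *)c");
    var length = str.Length;
    // Try every start position (1-based) ...
    for (int start = 1; start <= length; start++)
    {
        // ... and every substring length that still fits inside the input.
        for (int groupLength = 1; start + groupLength - 1 <= length; groupLength++)
        {
            var candidate = str.Substring(start - 1, groupLength);
            //("\"" + candidate + "\"").Dump(); // uncomment to list every candidate
            var match = regex.Match(candidate);
            // Keep the candidate only if the match covers it entirely.
            if (match.Value == candidate)
            {
                candidate.Dump(); // Dump() is LINQPad's output helper
            }
        }
    }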
This gives:
a a a b c c
a a b c c
a b c c
This seems to be the correct answer, but it contradicts your result:
a a a b c => I state that this is not a match
a a b c c ok
a a b c => I state that this is not a match
a b c c ok
a b c => I state that this is not a match
For example, the regex you gave
(a )+(b )+(c *)c
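does not match the first entry in your result, because the pattern needs a second c after the one consumed by (c *):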
a a a b c
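As a quick check of that claim, the sketch below tests the disputed strings against the pattern, anchored so that the whole string has to match. It assumes C# 9 or later (top-level statements), and the helper name MatchesWhole is made up for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: true only when the pattern matches the entire input.
// \A and \z anchor the match to the very start and end of the string.
static bool MatchesWhole(string input) =>
    Regex.IsMatch(input, @"\A(?:(a )+(b )+(c *)c)\z");

foreach (var s in new[] { "a a a b c", "a a a b c c", "a a b c", "a b c" })
{
    Console.WriteLine($"\"{s}\" -> {MatchesWhole(s)}");
}
```

Only "a a a b c c" should print True here; the three strings that end in a single c print False.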
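The logic above can produce the same match more than once if you consider the start position unimportant. For example, if you simply repeat the given input: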
"y z a a a b c c z y z a a a b c c z"
it gives:
a a a b c c
a a b c c
a b c c
a a a b c c
a a b c c
a b c c
If you consider the position unimportant, you should apply a Distinct to this result.
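A minimal sketch of that deduplication, assuming C# 9 or later and System.Linq. The function name AllWholeMatches is invented for this example, and it anchors the pattern with \A...\z instead of comparing match.Value to the candidate, which is a swapped-in variant of the check used above rather than the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: every substring of input that the pattern matches in full.
// One entry is added per (start, length) pair, so repeated text yields repeats.
static List<string> AllWholeMatches(string input, string pattern)
{
    var anchored = new Regex(@"\A(?:" + pattern + @")\z");
    var results = new List<string>();
    for (int start = 0; start < input.Length; start++)
    {
        for (int len = 1; start + len <= input.Length; len++)
        {
            var candidate = input.Substring(start, len);
            if (anchored.IsMatch(candidate))
            {
                results.Add(candidate);
            }
        }
    }
    return results;
}

var all = AllWholeMatches("y z a a a b c c z y z a a a b c c z", "(a )+(b )+(c *)c");

// Distinct() removes the duplicate strings that come from the repeated input.
foreach (var m in all.Distinct())
{
    Console.WriteLine(m);
}
```

Anchoring was chosen here because comparing match.Value only works when the first match the engine finds happens to be the longest one; with an alternation such as a|ab that is not guaranteed.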
If the empty string is to be considered a possible match, the trivial case of an empty candidate should be added as well.
FYI, these are all the candidates that the regex examines:
"y"
"y "
"y z"
"y z "
"y z a"
"y z a "
"y z a a"
"y z a a "
"y z a a a"
"y z a a a "
"y z a a a b"
"y z a a a b "
"y z a a a b c"
"y z a a a b c "
"y z a a a b c c"
"y z a a a b c c "
"y z a a a b c c z"
" "
" z"
" z "
" z a"
" z a "
" z a a"
" z a a "
" z a a a"
" z a a a "
" z a a a b"
" z a a a b "
" z a a a b c"
" z a a a b c "
" z a a a b c c"
" z a a a b c c "
" z a a a b c c z"
"z"
"z "
"z a"
"z a "
"z a a"
"z a a "
"z a a a"
"z a a a "
"z a a a b"
"z a a a b "
"z a a a b c"
"z a a a b c "
"z a a a b c c"
"z a a a b c c "
"z a a a b c c z"
" "
" a"
" a "
" a a"
" a a "
" a a a"
" a a a "
" a a a b"
" a a a b "
" a a a b c"
" a a a b c "
" a a a b c c"
" a a a b c c "
" a a a b c c z"
"a"
"a "
"a a"
"a a "
"a a a"
"a a a "
"a a a b"
"a a a b "
"a a a b c"
"a a a b c "
"a a a b c c"
"a a a b c c "
"a a a b c c z"
" "
" a"
" a "
" a a"
" a a "
" a a b"
" a a b "
" a a b c"
" a a b c "
" a a b c c"
" a a b c c "
" a a b c c z"
"a"
"a "
"a a"
"a a "
"a a b"
"a a b "
"a a b c"
"a a b c "
"a a b c c"
"a a b c c "
"a a b c c z"
" "
" a"
" a "
" a b"
" a b "
" a b c"
" a b c "
" a b c c"
" a b c c "
" a b c c z"
"a"
"a "
"a b"
"a b "
"a b c"
"a b c "
"a b c c"
"a b c c "
"a b c c z"
" "
" b"
" b "
" b c"
" b c "
" b c c"
" b c c "
" b c c z"
"b"
"b "
"b c"
"b c "
"b c c"
"b c c "
"b c c z"
" "
" c"
" c "
" c c"
" c c "
" c c z"
"c"
"c "
"c c"
"c c "
"c c z"
" "
" c"
" c "
" c z"
"c"
"c "
"c z"
" "
" z"
"z"
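It's also good to know how the two main types of regex engines (NFA and DFA) work.
From http://msdn.microsoft.com/en-us/library/e347654k.aspx
.NET (and, I think, Java too) uses an NFA regex engine (as opposed to a DFA one). As it processes a particular language element, the engine uses greedy matching; that is, it matches as much of the input string as it possibly can. But it also saves its state after successfully matching a subexpression. If a match eventually fails, the engine can return to a saved state so that it can try additional matches. This process of abandoning a successful subexpression match so that later language elements in the regular expression can also match is known as backtracking.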
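To make greedy matching and backtracking concrete, here is a small sketch, assuming C# 9 or later. The helper FirstGroup and the two simplified patterns are illustrative only; they are not the pattern from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: the text captured by group 1, or "" if nothing matched.
static string FirstGroup(string input, string pattern)
{
    var m = Regex.Match(input, pattern);
    return m.Success ? m.Groups[1].Value : "";
}

// Greedy: (.*) first consumes the whole input, then backtracks by one
// character so that the trailing 'c' in the pattern can still match.
Console.WriteLine("[" + FirstGroup("a b c c", "(.*)c") + "]");

// Lazy: (.*?) grows one character at a time and stops before the first 'c'.
Console.WriteLine("[" + FirstGroup("a b c c", "(.*?)c") + "]");
```

The greedy pattern prints [a b c ] and the lazy one prints [a b ]. In both cases the engine reports a single match per start position, which is why the substring enumeration above is needed to see the others.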