I'm trying to build a regular expression that looks something like this:
[match-word] ... [exclude-specific-word] ... [match-word]
This seems to work with a negative lookahead, but I run into a problem in a situation like this:
[match-word] ... [exclude-specific-word] ... [match-word] ... [excluded word appears again]
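I want the sentence above to match, but the negative lookahead between the first and second matched words "spills over", so the second word never matches.
Let's look at a practical example.
I want to match every sentence that contains the word "i" and the word "pie", but not the word "hate" between those two words. I have these three sentences: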
i sure like eating pie, but i love donuts <- Want to match this
i sure like eating pie, but i hate donuts <- Want to match this
i sure hate eating pie, but i like donuts <- Don't want to match this
I have this regular expression:
^i(?!.*hate).*pie
(I have removed the word boundaries for clarity; the original is: ^i\b(?!.*\bhate\b).*\bpie\b)
This matches the first sentence but not the second, because the negative lookahead scans the whole rest of the string.
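For reference, the behaviour can be reproduced with a minimal sketch like the one below. It assumes JDK regex (java.util.regex) rather than JRegex, and the class and method names are purely illustrative.

```java
import java.util.regex.Pattern;

class LookaheadSpill {
    // Pattern from the question, with the word boundaries restored.
    static final Pattern ORIGINAL =
            Pattern.compile("^i\\b(?!.*\\bhate\\b).*\\bpie\\b");

    // find() is sufficient here because the pattern is anchored with ^.
    static boolean matches(String sentence) {
        return ORIGINAL.matcher(sentence).find();
    }

    public static void main(String[] args) {
        String[] sentences = {
            "i sure like eating pie, but i love donuts", // matches
            "i sure like eating pie, but i hate donuts", // should match, but does not
            "i sure hate eating pie, but i like donuts"  // correctly rejected
        };
        for (String s : sentences) {
            System.out.println(matches(s) + "  " + s);
        }
    }
}
```

The second sentence is rejected because (?!.*hate) is evaluated once, right after "i", and its .* can reach the "hate" that comes after "pie".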
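Is there a way to limit the negative lookahead so that it is satisfied if it encounters "pie" before it encounters "hate"?
Note: in my implementation, other terms may follow this regex (it is built dynamically from a grammar search engine), for example: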
^i(?!.*hate).*pie.*donuts
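I'm currently using JRegex, but could switch to JDK Regex if necessary.
Update: I forgot to mention something in my original question:
The "negated construct" may occur further on in the sentence, and if possible I do want to match the sentence even when the "negated" construct appears further on.
To clarify, look at these sentences: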
i sure like eating pie, but i love donuts <- Want to match this
i sure like eating pie, but i hate donuts <- Want to match this
i sure hate eating pie, but i like donuts <- Don't want to match this
i sure like eating pie, but i like donuts and i hate making pie <- Do want to match this
rob's answer works perfectly with this extra constraint, so I accepted that one.
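For anyone who wants something to experiment with, here is a minimal sketch of one common way to get this behaviour. It is not necessarily what the accepted answer proposed. The idea is to re-check the negative lookahead before every character consumed between "i" and "pie", instead of checking it once against the whole rest of the string. The sketch assumes JDK regex (java.util.regex); the class, field, and method names are hypothetical.

```java
import java.util.regex.Pattern;

class TemperedLookahead {
    // Assumed rewrite (not taken from the question): the lookahead is tested
    // at each position before a character is consumed, so only the text
    // between "i" and the matched "pie" is checked for "hate".
    static final Pattern I_PIE =
            Pattern.compile("^i\\b(?:(?!\\bhate\\b).)*?\\bpie\\b");

    // Same idea with a trailing term appended, as in the dynamically built case.
    static final Pattern I_PIE_DONUTS =
            Pattern.compile("^i\\b(?:(?!\\bhate\\b).)*?\\bpie\\b.*\\bdonuts\\b");

    static boolean matches(Pattern pattern, String sentence) {
        return pattern.matcher(sentence).find();
    }

    public static void main(String[] args) {
        String[] sentences = {
            "i sure like eating pie, but i love donuts",
            "i sure like eating pie, but i hate donuts",
            "i sure hate eating pie, but i like donuts",
            "i sure like eating pie, but i like donuts and i hate making pie"
        };
        for (String s : sentences) {
            System.out.println(matches(I_PIE, s) + "  " + s);
        }
    }
}
```

With these patterns, only the third sentence is rejected, since "hate" sits between "i" and the only "pie" in it. A "hate" that appears after the matched "pie" no longer affects the result.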