python - Python regex, two negative lookahead assertions

Question

I have a string of syntactically parsed text:

 s = 'ROOT (S (VP (VP (VB the) (SBAR (S (NP (DT same) (NN lecturer)) (VP (VBZ says)'

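I want to match 'the same' in s. The key point is that 'the' and 'same' should match only when they are separated by syntactic markup (i.e. (, NP, S, etc.). So 'same' should not find a match in s2: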

 s2= 'ROOT (S (VP (VP (VB the) (SBAR (S (NP (DT lecturer) (NN same)) (VP (VBZ says)'

I tried a pair of negative lookahead assertions, to no avail:

 >>> import re
 >>> rx = r'the(?![a-z]*)same(?![a-z]*)'
 >>> re.findall(rx, s)
 []

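The idea was to match 'the' when it is not followed by lowercase characters, and then match 'same' when it is not followed by lowercase characters.

Does anyone have a better approach?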

score 1 · Accepted Answer

So you want to match if none of the characters between 'the' and 'same' is a lowercase letter. Here is how you can write that in a regular expression:

the[^a-z]*same

Note that you may also want to add word boundaries, so that you don't match something like foothe ... samebar, like this:

\bthe\b[^a-z]*\bsame\b
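
As a quick check, here is a minimal sketch that runs the word-boundary pattern against the two strings from the question. The variable names and the 'foothe (NP samebar' test string are illustrative only. It also shows why the original attempt returned nothing: [a-z]* can match the empty string, so the negative lookahead (?![a-z]*) fails at every position.

```python
import re

s = 'ROOT (S (VP (VP (VB the) (SBAR (S (NP (DT same) (NN lecturer)) (VP (VBZ says)'
s2 = 'ROOT (S (VP (VP (VB the) (SBAR (S (NP (DT lecturer) (NN same)) (VP (VBZ says)'

# 'the', then any run of non-lowercase characters, then 'same'
pattern = re.compile(r'\bthe\b[^a-z]*\bsame\b')

m = pattern.search(s)    # only markup sits between 'the' and 'same'
m2 = pattern.search(s2)  # 'lecturer' sits in between, so no match

print(m.group(0) if m else None)
print(m2)

# Without word boundaries, 'the' and 'same' inside longer words also match.
loose = re.search(r'the[^a-z]*same', 'foothe (NP samebar')
strict = pattern.search('foothe (NP samebar')

# [a-z]* always matches (possibly empty), so this lookahead never succeeds.
never = re.search(r'the(?![a-z]*)', 'the')
```

Here m matches the span from 'the' through 'same' in s, m2 is None for s2, and never is None regardless of the input.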
