python - Python regex: match only as long as a certain character is absent

Question

I'm having some trouble with another regular expression. For this one, my code is supposed to look for the pattern:

re.compile(r"kill(?:ed|ing|s)\D*(\d+).*?(?:men|women|children|people)?")

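However, it matches too aggressively. It does find a sentence containing the word "kill", but the pattern then keeps consuming text until it reaches a number further down. In particular, it matches: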

killed in an apparent u.s. drone attack on a car in yemen on sunday, tribal sources and local officials said.the men's car was driving through the south-eastern province of maareb, a mostly desert region where militants have taken refuge after being driven from southern strongholds.yemen, where al qaeda militants exploited a security vacuum during last year's uprising that ousted president ali abdullah saleh, has seen an in10

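This is not the behavior I'm after. If the pattern cannot be found within a single sentence, I want the match to fail.

The solution I'm trying to implement, in pseudocode, is: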

find instance of 'kill'
if what follows contains a period (\.) before a digit, do not match.

My failed implementation looks like this:

re.compile(r"kill(?:ed|ing|s)\D*(?!:\..*?)(\d+).*?(?:men|women|children|people)?")

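I tried a lookbehind, but I have to specify a fixed width for it. What I'm trying to do with the pattern above is match any of the endings of 'kill', followed by any non-digit characters but not a period, while allowing anything else before the number I'm after.

Sadly, this code behaves exactly the same in my tests. Any help would be appreciated.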

score 3 · Accepted Answer

A small modification:

r"kill(?:ed|ing|s)[^\d.]*(\d+)[^.]*?(?:men|women|children|people)?"

Basically, I prevent a period from being matched anywhere between kill and men/women/etc.
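To see the difference, here is a minimal sketch that runs the original pattern and the revised one side by side. The two sample sentences are made up for this illustration and are not from the question's data. It also hints at why the lookahead attempt changed nothing: `\D*` has already consumed every period by the time the lookahead is checked, and `(?!:` tests for a literal colon rather than opening a non-capturing group.

```python
import re

# Original pattern from the question: \D* happily runs across sentence boundaries.
original = re.compile(r"kill(?:ed|ing|s)\D*(\d+).*?(?:men|women|children|people)?")

# Revised pattern from the answer: [^\d.]* stops at a digit OR a period,
# so the number must appear before the sentence ends.
revised = re.compile(r"kill(?:ed|ing|s)[^\d.]*(\d+)[^.]*?(?:men|women|children|people)?")

# Hypothetical sample text, invented for this example.
cross_sentence = "four militants were killed on sunday. officials said 10 cars were searched."
same_sentence = "the blast killed at least 12 people on sunday."

# The original pattern jumps over the period and grabs a number from the next sentence.
print(original.search(cross_sentence).group(1))

# The revised pattern refuses to cross the period, so there is no match at all.
print(revised.search(cross_sentence))

# Within a single sentence the revised pattern still captures the number.
print(revised.search(same_sentence).group(1))
```

One thing to be aware of: because the pattern ends in a lazy `[^.]*?` followed by an optional group, the overall match stops right after the digits (here `killed at least 12`), so the trailing noun is never actually part of the match. If the noun matters, it would have to be made mandatory or captured explicitly.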
