3

I have a list of words - "foo", "bar", "baz" - and I want to write a regexp which would match strings which contain at least 2 of them. E.g., "foo baz" should match while "ba foo z" should not.

The obvious solution "(foo|bar|baz).*(foo|bar|baz)" works, but I find it unsatisfactory because it lists the words twice. What if I have 25 words instead of just 3? What if I am looking for strings which contain at least 4 given words instead of just 2?

4

2 Answers

4

It doesn't sound like you are looking for exact words, so Donkey's solution may not be what you want.

((foo|bar|baz).*?){2}

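It searches the text for any of these strings, then lazily consumes any characters until it finds one of the alternatives again. After the second word, the lazy any-character part finishes by matching nothing, so the match is complete.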
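
For a longer word list or a higher count, the same pattern can be generated rather than typed out. The following is a minimal Python sketch, not a definitive implementation: the helper name `at_least_k` is hypothetical, and it uses non-capturing groups `(?:...)` in place of the capturing groups in the pattern above. Like the original pattern, it counts occurrences, so the same word appearing twice also satisfies it.

```python
import re

def at_least_k(words, k):
    # Hypothetical helper: builds (?:(?:w1|w2|...).*?){k} from a word list.
    # re.escape guards against words that contain regex metacharacters.
    alternation = "|".join(re.escape(w) for w in words)
    return re.compile("(?:(?:" + alternation + ").*?){" + str(k) + "}")

pattern = at_least_k(["foo", "bar", "baz"], 2)

print(bool(pattern.search("foo baz")))   # True
print(bool(pattern.search("ba foo z")))  # False

# The lazy .*? stops as soon as the second word has been found.
m = pattern.search("xx foo yy baz zz")
print(m.group(0))  # foo yy baz
```

Each word is listed once in the word list, and the count is a single parameter.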

If you want it to match across multiple lines, be sure to turn on dot-all mode, or use [\s\S] instead of the dot.
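
To illustrate the multi-line case, here is a short sketch assuming Python's `re` module, where dot-all mode is the `re.DOTALL` flag; other regex engines spell the flag differently. The variable names are illustrative only.

```python
import re

text = "foo\nbaz"  # the two words sit on different lines

plain = re.compile(r"((foo|bar|baz).*?){2}")
dotall = re.compile(r"((foo|bar|baz).*?){2}", re.DOTALL)
portable = re.compile(r"((foo|bar|baz)[\s\S]*?){2}")

print(bool(plain.search(text)))     # False: "." does not match a newline by default
print(bool(dotall.search(text)))    # True: with DOTALL, "." also matches "\n"
print(bool(portable.search(text)))  # True: [\s\S] matches any character, newline included
```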

answered 2013-06-18T15:23:35.240
0

I think this solution should work:

"(foo|bar|baz).*\s+\1(\s+|$)"

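\s means that a whitespace character is required, which ensures that you find the exact word and not just a prefix. For example, "foo ... fooo" is not recognized.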
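
A quick way to check what this pattern accepts is to run it, sketched here with Python's `re` module as an assumed engine. One point to keep in mind: `\1` is a backreference to the text captured by the first group, so the pattern requires the same word to appear a second time, not any second word from the list.

```python
import re

pattern = re.compile(r"(foo|bar|baz).*\s+\1(\s+|$)")

# The same word repeated as a whole word: matches.
print(bool(pattern.search("foo x foo")))     # True

# The second occurrence is only a prefix of "fooo": no match.
print(bool(pattern.search("foo ... fooo")))  # False

# Two different words from the list: no match, because \1 must repeat "foo".
print(bool(pattern.search("foo baz")))       # False
```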

answered 2013-06-18T15:20:53.633