python - Python regex to search for two words

Question

I'm trying to find out whether a sentence contains the phrase "go * to", for example "go over to", "go up to", etc. I'm using TextBlob, and I know I can use the following:

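from textblob import TextBlob

search_go_to = {"go", "to"}
go_to_blob = TextBlob(var)  # var holds the text to search
# keeps every sentence that contains "go" OR "to", in any position
matches = [str(s) for s in go_to_blob.sentences if search_go_to & set(s.words)]
print(matches)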

But this also returns sentences like "go there and bring this to him", which I don't want. Does anyone know how I can do something like text.find("go * to")?

score 3 · Accepted Answer


Try using:

for match in re.finditer(r"go\s+\w+\s+to", text, re.IGNORECASE):
    print(match.group())  # the matched phrase, e.g. "go over to"

answered 2015-01-17T22:05:01.863
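
For anyone who wants to run the loop above, here is a minimal self-contained sketch. The sample text and the added \b word boundaries are assumptions, not part of the original answer. The boundaries keep "go" and "to" from matching inside longer words such as "ago" or "took".

```python
import re

# Sample text is an assumption for illustration.
text = "Go over to the desk, then go up to the roof. Long ago we took it home."

# \b stops "go" matching inside "ago" and "to" matching inside "took".
pattern = re.compile(r"\bgo\s+\w+\s+to\b", re.IGNORECASE)

found = [m.group(0) for m in pattern.finditer(text)]
print(found)  # ['Go over to', 'go up to']
```

Without the boundaries, the same text would also yield "go we to", taken from "ago we took".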

score 2 · Accepted Answer

Using generator expressions:

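>>> import re
>>> # lists rather than sets: a set has no guaranteed order, so the joined
>>> # pattern could otherwise come out as 'to .*? go'
>>> search_go_to = ["go", "to"]
>>> m = ' .*? '.join(x for x in search_go_to)
>>> words = ["go over to", "go up to", "foo bar"]
>>> matches = [s for s in words if re.search(m, s)]
>>> print(matches)
['go over to', 'go up to']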
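
The join above could be generalized along these lines. This is only a sketch: the helper name build_gap_pattern is hypothetical, and escaping each word, adding \b boundaries, and allowing exactly one word in each gap are assumptions that go beyond the original answer.

```python
import re

def build_gap_pattern(words):
    """Hypothetical helper: match the words in order with exactly one word between each pair."""
    escaped = [re.escape(w) for w in words]  # treat each search word literally
    return re.compile(r"\b" + r"\s+\w+\s+".join(escaped) + r"\b")

pattern = build_gap_pattern(["go", "to"])
phrases = ["go over to", "go up to", "foo bar", "going to",
           "go there and bring this to him"]
matches = [p for p in phrases if pattern.search(p)]
print(matches)  # ['go over to', 'go up to']
```

Limiting the gap to a single word is what rejects the sentence from the question, "go there and bring this to him", which the ' .*? ' version would accept.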

score 1 · Accepted Answer

Try this:

import re

text = "something go over to something"

# raw string, so the backslashes reach the regex engine unchanged
if re.search(r"go\s+?\S+?\s+?to", text):
    print("found")
else:
    print("not found")

Regex explanation:

\s matches any whitespace character
\S matches any non-whitespace character, including special characters
+? is the non-greedy (lazy) form of + (not required for the OP's question)

So re.search(r"go\s+?\S+?\s+?to", text) will match "something go W#$%^^$ to something", and of course "something go over to something" as well.
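
The explanation above can be checked with a short sketch. The comparison with \w+ and with the greedy quantifiers is an addition for illustration, and the variable names are made up for this example.

```python
import re

samples = ["something go over to something", "something go W#$%^^$ to something"]

word_gap = re.compile(r"go\s+\w+\s+to")      # gap limited to letters, digits and underscore
nonspace_gap = re.compile(r"go\s+\S+\s+to")  # gap may be any run of non-whitespace characters

results = [(bool(word_gap.search(s)), bool(nonspace_gap.search(s))) for s in samples]
print(results)  # [(True, True), (False, True)]

# Lazy and greedy quantifiers find the same text here, because \S+ cannot cross a space.
lazy = [re.search(r"go\s+?\S+?\s+?to", s).group(0) for s in samples]
greedy = [nonspace_gap.search(s).group(0) for s in samples]
print(lazy == greedy)  # True
```

So \S+ is the choice if the middle token may contain punctuation, while \w+ restricts it to an ordinary word.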

score 0 · Accepted Answer

Does this work?

import re
from textblob import TextBlob

search_go_to = re.compile("^go.*to$")
go_to_blob = TextBlob(var)
matches = [str(s) for s in go_to_blob.sentences if search_go_to.match(str(s))]
print(matches)

Explanation of the regex:

^    beginning of line/string
go   literal matching of "go"
.*   zero or more characters of any kind
to   literal matching of "to"
$    end of line/string

If you don't want "going to" to match, insert a word boundary after go and before to: \\b in a regular string literal, or \b in a raw string.
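
A sketch of how the anchored pattern with word boundaries behaves, next to an unanchored variant used with search(). The sample sentences and the second pattern are assumptions for illustration, and plain strings stand in for TextBlob sentences.

```python
import re

# Sample sentences are assumptions for illustration.
sentences = [
    "go over to",
    "Please go up to the counter.",
    "going to",
    "go there and bring this to him",
]

anchored = re.compile(r"^go\b.*\bto$")       # whole string must start with "go" and end with "to"
one_word = re.compile(r"\bgo\s+\w+\s+to\b")  # "go", one word, "to", anywhere in the string

anchored_hits = [s for s in sentences if anchored.match(s)]
one_word_hits = [s for s in sentences if one_word.search(s)]
print(anchored_hits)  # ['go over to']
print(one_word_hits)  # ['go over to', 'Please go up to the counter.']
```

The ^ and $ anchors only accept a string that begins with "go" and ends with "to", so a phrase in the middle of a longer sentence needs the unanchored form with search().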
