python - Python regular expressions - difference between search and findall

Question

I'm trying to use Python regular expressions on a URL string.

>>> import re
>>> id = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'
>>> re.search('news|ejournals|theses', id).group()
'ejournals'
>>> re.findall('news|ejournals|theses', id)
['ejournals', 'news']

According to the documentation at http://docs.python.org/2/library/re.html#finding-all-adverbs, search() matches the first occurrence and findall() finds all possible matches in the string.

I'm wondering why 'news' is not captured by search(), even though it is declared first in the pattern.

Am I using the wrong pattern? I want to check whether any of these keywords occurs in the string.

score 4 · Accepted Answer

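You're thinking about it backwards. The regex engine walks through the target string from left to right looking for "news" OR "ejournals" OR "theses" and returns the first one it finds. In this case that is "ejournals", which appears first in the target string. The order of the alternatives in the pattern only matters when more than one of them could match at the same position.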
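A minimal sketch of that behaviour, using the URL from the question. The variable name `url` is my own choice (it avoids shadowing the built-in `id`), and the `'newsome'` test string is only there to illustrate the same-position case.

```python
import re

url = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'

# Reordering the alternatives does not change the result: search() reports
# the match that starts earliest in the string.
original_order = re.search('news|ejournals|theses', url)
reversed_order = re.search('theses|ejournals|news', url)
print(original_order.group(), original_order.start())
print(reversed_order.group(), reversed_order.start())

# Alternation order only decides the outcome when several alternatives
# can match at the same starting position.
short_first = re.search('new|news', 'newsome').group()
long_first = re.search('news|new', 'newsome').group()
print(short_first, long_first)
```

Both searches on `url` return the same keyword at the same offset, while the two `'newsome'` searches differ because both alternatives start matching at index 0.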

score 3 · Accepted Answer

The re.search() function stops at the first occurrence in the string that satisfies your condition, not at the first option in the pattern.

score 0 · Accepted Answer

Note that there are some other differences between search and findall that are not covered here. For example:

python-regex: why does findall find nothing, but search works?
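One difference of this kind, which may or may not be the one discussed in the linked question (that is an assumption on my part), involves capturing groups: when the pattern contains a group, findall() returns the group contents instead of the whole match. A small sketch, with `url` and the trailing `/` in the pattern chosen only for illustration:

```python
import re

url = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'

# With a capturing group, search().group() still gives the whole match ...
whole_match = re.search(r'(news|ejournals|theses)/', url).group()

# ... but findall() returns only what the group captured.
captured_only = re.findall(r'(news|ejournals|theses)/', url)

# A non-capturing group (?:...) makes findall() return whole matches again.
whole_matches = re.findall(r'(?:news|ejournals|theses)/', url)

print(whole_match, captured_only, whole_matches)
```

If the two functions appear to disagree, checking the pattern for parentheses is a reasonable first step.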

score 0 · Accepted Answer

>>> id = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'

>>> re.search('news|ejournals|theses', id).group()
'ejournals'

re.search -> searches the string for the first occurrence and then stops.

>>> re.findall('news|ejournals|theses', id)
['ejournals', 'news']

re.findall -> searches the string for all matches and returns them as a list.
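The two functions also differ in what they return, which matters when nothing matches. The sketch below assumes a hypothetical helper `contains_keyword` for the "does any keyword occur" check from the question; the second test string is made up.

```python
import re

PATTERN = 'news|ejournals|theses'
url = 'edu.vt.lib.scholar:http/ejournals/VALib/v48_n4/newsome.html'
other = 'example.org/index.html'  # made-up string with none of the keywords

# search() returns a match object, or None when nothing matches,
# so calling .group() directly on the result can raise AttributeError.
first_hit = re.search(PATTERN, url)
no_hit = re.search(PATTERN, other)

# findall() always returns a list, which is empty when nothing matches.
all_hits = re.findall(PATTERN, url)
no_hits = re.findall(PATTERN, other)


def contains_keyword(text):
    """Hypothetical helper: True if any keyword occurs anywhere in text."""
    return re.search(PATTERN, text) is not None


print(first_hit.group(), no_hit, all_hits, no_hits)
print(contains_keyword(url), contains_keyword(other))
```

For a plain yes/no check, search() is enough, since it can stop at the first hit instead of scanning for every match.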
