python - Extracting multiple instances with a regex in Python

Question

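I have a string:

This is @lame

Here I want to extract lame. But here is the problem: the string above could also be

This is lame

Here I extract nothing. And then the string could be:

This is @lame but that is @not

Here I extract lame and not.

So the output I expect in each case is: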

 [lame]
 []
 [lame,not]

How can I extract these in a robust way in Python?

score 3 · Accepted Answer

Use re.findall() to find multiple matches; in this case, any run of word characters that is preceded by an @:

re.findall(r'(?<=@)\w+', inputtext)

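The (?<=..) construct is a positive lookbehind assertion; it matches only if the current position is preceded by an @ character. The pattern above therefore matches 1 or more word characters (the \w character class) only when those characters are preceded by an @ symbol, and the @ itself is not part of the match.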

Demo:

>>> import re
>>> re.findall(r'(?<=@)\w+', 'This is @lame')
['lame']
>>> re.findall(r'(?<=@)\w+', 'This is lame')
[]
>>> re.findall(r'(?<=@)\w+', 'This is @lame but that is @not')
['lame', 'not']
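
As a rough comparison, the lookbehind used above can also be written with a capture group. When a pattern contains exactly one group, re.findall() returns the text of that group rather than the whole match, so the @ is consumed but not returned. The variable names here are only illustrative:

```python
import re

text = 'This is @lame but that is @not'

# Lookbehind form from the answer: the match itself never includes the "@".
lookbehind = re.findall(r'(?<=@)\w+', text)

# Capture-group form: "@" is part of the match, but findall() returns
# only the contents of the single group.
captured = re.findall(r'@(\w+)', text)

print(lookbehind)
print(captured)
```

For this input both forms should give the same list; which one to use is mostly a matter of taste.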

If you plan to reuse the pattern, compile the expression first, then call the .findall() method on the compiled regular expression object:

at_words = re.compile(r'(?<=@)\w+')

at_words.findall(inputtext)

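This saves you a cache lookup on every .findall() call, because the re module functions otherwise have to look up the compiled pattern in an internal cache each time.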
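
A minimal sketch of that reuse, assuming the three sample strings from the question are processed in one pass (the samples and results names are made up for the example):

```python
import re

# Compile once, outside the loop; at_words is the name used in the answer.
at_words = re.compile(r'(?<=@)\w+')

# The three sample inputs from the question.
samples = [
    'This is @lame',
    'This is lame',
    'This is @lame but that is @not',
]

# Reuse the same compiled pattern for every input.
results = [at_words.findall(text) for text in samples]

for text, found in zip(samples, results):
    print(repr(text), '->', found)
```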

score 1 · Accepted Answer

This will give the output you asked for:

import re
regex = re.compile(r'(?<=@)\w+')
print(regex.findall('This is @lame'))
print(regex.findall('This is lame'))
print(regex.findall('This is @lame but that is @not'))

score 1 · Accepted Answer

You should use the re library; here is an example:

import re
test_case = "This is @lame but that is @not"
# Note: this pattern keeps the leading "@" in each match
regular = re.compile(r"@[\w]*")
lst = regular.findall(test_case)
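
One thing to watch with this last pattern: it returns the @ along with the word, and because of the * quantifier it also matches a bare @ with nothing after it. A possible way to get from there to the output the question asks for is sketched below; the sample string and the with_at and cleaned names are assumptions for illustration only:

```python
import re

# Hypothetical input containing a lone "@" as well as a tagged word.
test_case = "meet me @ noon, this is @lame"

# The pattern from the answer: matches include the "@" itself.
with_at = re.findall(r"@[\w]*", test_case)

# Drop the leading "@" and discard matches that were only a bare "@".
cleaned = [m[1:] for m in with_at if len(m) > 1]

print(with_at)
print(cleaned)
```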
