python - Simple regular expression to get a Twitter username from a string

Question

Given a simple string such as "@dasweo where you at?", I want to write a regular expression that extracts "dasweo".

What I have so far is:

print re.findall(r"@\w{*}", "@dasweo where you at?")

But this doesn't work. Can anyone help me with this?

score 3 · Accepted Answer

Remove the {..} curly braces; they are not used with *:

>>> re.findall(r"@\w*", "@dasweo where you at?")
['@dasweo']

Use {..} only for quantifiers that take an explicit count:

\w{3}

for example, matches exactly 3 word characters.
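
As a rough sketch of how the two quantifiers differ, here is the string from the question run through both a fixed count and *. The variable names are only illustrative:

```python
import re

text = "@dasweo where you at?"

# {3} takes exactly three word characters after the @
exactly_three = re.findall(r"@\w{3}", text)

# * takes zero or more word characters, as many as are available
zero_or_more = re.findall(r"@\w*", text)

print(exactly_three)  # ['@das']
print(zero_or_more)   # ['@dasweo']
```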

score 2 · Accepted Answer

Use this pattern:

import re
print(re.findall(r"@\w+", "@dasweo where you at?"))

\w means any word character, and + means one or more.
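
The difference between + and * shows up when an @ stands alone. A small sketch, using a made-up sample string that is not from the question:

```python
import re

# Hypothetical input containing a bare "@" as well as a real mention
text = "mail me @ home, or ping @dasweo"

with_star = re.findall(r"@\w*", text)  # * also accepts zero characters
with_plus = re.findall(r"@\w+", text)  # + needs at least one character

print(with_star)  # ['@', '@dasweo']
print(with_plus)  # ['@dasweo']
```

With *, the lone @ counts as a match with an empty name, which is usually not what you want for usernames.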

score 2 · Accepted Answer

You can use this:

import re
print(re.findall(r"(?<=@)\w+", "@dasweo where you at?"))

Here the lookbehind (?<=..) means "preceded by": it only checks that the @ is there and does not include it in the match.

score 2 · Accepted Answer

Since you don't want the @ to be included in the match, you can use a positive lookbehind:

>>> import re
>>> re.findall(r"(?<=@)\w+", "@dasweo where you at?")
['dasweo']

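In general, a regex of the form (?<=X)Y matches Y when it is preceded by X, but does not include X in the actual match. In your case, X is @ and Y is \w+, one or more word characters. A word character is an alphanumeric character or an underscore.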
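
To see that the lookbehind really leaves the @ out of the match, it can help to compare where each match starts. This is only an illustrative sketch using re.search on the string from the question:

```python
import re

text = "@dasweo where you at?"

# Plain pattern: the @ is part of the match, which starts at index 0
with_at = re.search(r"@\w+", text)

# Lookbehind: the @ is only checked, so the match starts at index 1
lookbehind = re.search(r"(?<=@)\w+", text)

print(with_at.group(), with_at.start())        # @dasweo 0
print(lookbehind.group(), lookbehind.start())  # dasweo 1
```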

By the way, there is more than one way to do this. You can also use a capture group:

>>> [m.group(1) for m in re.finditer(r"@(\w+)", "@dasweo where you at?")]
['dasweo']

m.group(1) returns the value of the first capture group. In this case, that is the text matched by \w+.
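
As a shorter variant of the same idea: when a pattern contains exactly one capture group, re.findall returns the group contents rather than the whole match, so the list comprehension is optional. A minimal sketch, where the second string is a made-up example:

```python
import re

# With a single capture group, findall returns only the captured text
names = re.findall(r"@(\w+)", "@dasweo where you at?")
print(names)  # ['dasweo']

# Hypothetical input with several mentions
several = re.findall(r"@(\w+)", "@alice and @bob_99 are here")
print(several)  # ['alice', 'bob_99']
```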
