python - How do I get all terms that start with "#"?

Question

I have a string like this: "sometext #Syrup #nshit #thebluntislit"

I want to get a list of all the terms that start with "#".

I used the following code:

import re
line = "blahblahblah #Syrup #nshit #thebluntislit"
ht = re.search(r'#\w*', line)
ht = ht.group(0)
print(ht)

I get the following:

#Syrup

I would like to know if there is a way to get a list like this instead:

[#Syrup,#nshit,#thebluntislit]

That is, all the terms that start with "#", not just the first one.

score 21 · Accepted Answer

A language as good as Python doesn't need regular expressions for this:

  hashed = [ word for word in line.split() if word.startswith("#") ]
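As a minimal sketch of how this behaves, here is the same comprehension run on the string from the question. The second input, "try #this, ok", is a made-up example that is not in the question; it is only there to show that splitting on whitespace leaves trailing punctuation attached to the term.

```python
line = "blahblahblah #Syrup #nshit #thebluntislit"

# Split on whitespace and keep only the tokens that begin with "#".
hashed = [word for word in line.split() if word.startswith("#")]
print(hashed)  # ['#Syrup', '#nshit', '#thebluntislit']

# Hypothetical input: split() only looks at whitespace,
# so the comma stays part of the token.
with_punct = [word for word in "try #this, ok".split() if word.startswith("#")]
print(with_punct)  # ['#this,']
```

If the punctuation matters for your data, one of the regex-based answers may fit better.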

score 4 · Accepted Answer

You can use

compiled = re.compile(r'#\w*')
compiled.findall(line)

Output:

['#Syrup', '#nshit', '#thebluntislit']

But there is a catch. If you search a string like 'blahblahblah #Syrup #nshit #thebluntislit beg#end', the output will be ['#Syrup', '#nshit', '#thebluntislit', '#end'].

This problem can be solved with a positive lookbehind:

compiled = re.compile(r'(?<=\s)#\w*')

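(It is not possible to use \b (word boundary) here, because # is not one of the \w characters [0-9a-zA-Z_] that can make up the word whose boundary is being searched for.)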
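To make the difference concrete, here is a small sketch comparing the plain pattern, a \b attempt, and the lookbehind on the 'beg#end' string. The last pattern, (?<!\S), is not from this answer; it is an assumed variant using a negative lookbehind ("not preceded by a non-whitespace character") that may help if a term can appear at the very start of the string, where (?<=\s) has no whitespace to look back at.

```python
import re

line = "blahblahblah #Syrup #nshit #thebluntislit beg#end"

# Plain pattern: also picks up the "#end" inside "beg#end".
plain = re.findall(r'#\w*', line)
print(plain)  # ['#Syrup', '#nshit', '#thebluntislit', '#end']

# \b sits between a word and a non-word character, so it only
# matches where "#" directly follows a word character.
boundary = re.findall(r'\b#\w*', line)
print(boundary)  # ['#end']

# Positive lookbehind: "#" must be preceded by whitespace.
lookbehind = re.findall(r'(?<=\s)#\w*', line)
print(lookbehind)  # ['#Syrup', '#nshit', '#thebluntislit']

# A term at the start of the string has no whitespace before it.
at_start = re.findall(r'(?<=\s)#\w*', "#first #second")
print(at_start)  # ['#second']

# Assumed variant (not from the answer): negative lookbehind
# also accepts the start of the string.
variant = re.findall(r'(?<!\S)#\w*', "#first #second beg#end")
print(variant)  # ['#first', '#second']
```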

score 1 · Accepted Answer


It looks like re.findall() will do what you want.

matches = re.findall(r'#\w*', line)
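A short sketch of why this differs from the code in the question: re.search() stops at the first match, while re.findall() returns every non-overlapping match as a list. The "a # b #c" input is a made-up example, used only to show that \w* also accepts a bare "#", whereas \w+ (a possible alternative, not part of this answer) requires at least one word character after it.

```python
import re

line = "blahblahblah #Syrup #nshit #thebluntislit"

# re.search returns a match object for the first occurrence only.
first = re.search(r'#\w*', line).group(0)
print(first)  # #Syrup

# re.findall returns all non-overlapping matches as a list of strings.
matches = re.findall(r'#\w*', line)
print(matches)  # ['#Syrup', '#nshit', '#thebluntislit']

# Hypothetical input: \w* matches zero or more word characters,
# so a lone "#" counts as a match; \w+ skips it.
star = re.findall(r'#\w*', "a # b #c")
plus = re.findall(r'#\w+', "a # b #c")
print(star)  # ['#', '#c']
print(plus)  # ['#c']
```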

answered 2011-12-01T20:10:26.563
