I want to use Python to pick out the words in a sentence that start with 's'.
Here is my code:
import re
text = "I was searching my source to make a big desk yesterday."
m = re.findall(r'[s]\w+', text)
print(m)
But the result of the code is:
['searching', 'source', 'sk', 'sterday']
How should I write the regular expression? Or is there another way to pick out these words?
>>> import re
>>> text = "I was searching my source to make a big desk yesterday."
>>> re.findall(r'\bs\w+', text)
['searching', 'source']
To match both lowercase and uppercase s,
use: r'\b[sS]\w+'
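As a quick check of the case-insensitive pattern, here is a minimal sketch. The sample sentence is made up for illustration, and passing re.IGNORECASE is an alternative to writing [sS] by hand, not something the answer above used.

```python
import re

# Hypothetical sample sentence containing both "S" and "s" words.
text = "Some people were searching my Source yesterday."

# Explicit character class, as suggested above.
with_class = re.findall(r'\b[sS]\w+', text)

# Alternative (an assumption, not from the answer): let the flag handle case.
with_flag = re.findall(r'\bs\w+', text, flags=re.IGNORECASE)

print(with_class)  # ['Some', 'searching', 'Source']
print(with_flag)   # same list
```

Both forms should give the same result here; the flag is mostly a convenience when the pattern contains more letters than just the leading one.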
I know this isn't a regex solution, but you can use startswith:
>>> text="I was searching my source to make a big desk yesterday."
>>> [ t for t in text.split() if t.startswith('s') ]
['searching', 'source']
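One caveat with the split-based approach: split() leaves punctuation attached to the words, and startswith is case-sensitive. Below is a minimal sketch of one way to handle both. The helper name words_starting_with and the sample sentence are hypothetical, chosen only for illustration.

```python
import string

def words_starting_with(text, letter):
    """Return whitespace-separated words that start with `letter`, ignoring case."""
    result = []
    for token in text.split():
        # Strip punctuation from both ends so '"search' becomes 'search'.
        word = token.strip(string.punctuation)
        if word.lower().startswith(letter.lower()):
            result.append(word)
    return result

# Hypothetical sample with quotes, a colon, and a capitalized word.
sample = 'Sam said: "search my source, please."'
found = words_starting_with(sample, 's')
print(found)  # ['Sam', 'said', 'search', 'source']
```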
If you only want to match a single character, you don't need to put it in a character class, so s is the same as [s].
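What you are looking for is a word boundary. A word boundary \b is an anchor that matches at the transition from a non-word character (\W) to a word character (\w), or vice versa.
The solution is: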
\bs\w+
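This regex matches an s that is not preceded by a word character (which also covers the start of the string) and requires at least one word character after it. \w+ matches all the word characters it can find, so there is no need for a \b at the end.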
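To see the effect of the word boundary, here is a small sketch that runs three variants of the pattern against the sentence from the question. The variable names are only for illustration.

```python
import re

text = "I was searching my source to make a big desk yesterday."

# Without a boundary, 's' can also match in the middle of a word.
no_boundary = re.findall(r's\w+', text)

# With \b, the 's' must sit at the start of a word.
leading_boundary = re.findall(r'\bs\w+', text)

# A trailing \b changes nothing here: \w+ is greedy and already
# runs to the end of the word.
both_boundaries = re.findall(r'\bs\w+\b', text)

print(no_boundary)       # ['searching', 'source', 'sk', 'sterday']
print(leading_boundary)  # ['searching', 'source']
print(both_boundaries)   # ['searching', 'source']
```

Note that "was" is not matched by any variant, because \w+ needs at least one word character after the s.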
Lambda style:
text = 'I was searching my source to make a big desk yesterday.'
list(filter(lambda word: word[0]=='s', text.split()))
Output:
['searching', 'source']
I tried this code sample, and I think it does exactly what you want:
import re
text = "I was searching my source to make a big desk yesterday."
m = re.findall(r'\b[s]\w+', text)
print(m)
I'd like to add one small thing here.
Suppose you have a line in which you want to find the words that start with 's':
line = "someone should show something to some@gmail.com"
If you write the regex like this,
swords = re.findall(r"\b[sS]\w+", line)
the output will be,
['someone', 'should', 'show', 'something', 'some']
But if you change the regex to,
# use \S instead of \w
swords = re.findall(r"\b[sS]\S+", line)
the output will be,
['someone', 'should', 'show', 'something', 'some@gmail.com']
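One thing to keep in mind with \S+ is that it matches any non-whitespace character, so it also picks up punctuation that directly follows a word. The sketch below uses a made-up variant of the line with a comma and a final period. Stripping trailing punctuation afterwards is just one possible workaround, shown here as an assumption rather than a recommended fix.

```python
import re

# Hypothetical line: same idea as above, with punctuation after some words.
line = "someone should show something, e.g. to some@gmail.com."

with_w = re.findall(r"\b[sS]\w+", line)
with_S = re.findall(r"\b[sS]\S+", line)

print(with_w)  # ['someone', 'should', 'show', 'something', 'some']
print(with_S)  # ['someone', 'should', 'show', 'something,', 'some@gmail.com.']

# One possible cleanup: drop punctuation from the right-hand end only.
cleaned = [w.rstrip('.,;:!?') for w in with_S]
print(cleaned)  # ['someone', 'should', 'show', 'something', 'some@gmail.com']
```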