You can do this with a single call to sub:
big_regex = re.compile('|'.join(map(re.escape, prohibitedWords)))
the_message = big_regex.sub("repl-string", str(word[1]))
Example:
>>> import re
>>> prohibitedWords = ['Some', 'Random', 'Words']
>>> big_regex = re.compile('|'.join(map(re.escape, prohibitedWords)))
>>> the_message = big_regex.sub("<replaced>", 'this message contains Some really Random Words')
>>> the_message
'this message contains <replaced> really <replaced> <replaced>'
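The same idea can be wrapped in a small helper. This is only a sketch: the name replace_words and its parameters are made up for illustration and are not part of the re module. It also guards against an empty word list, since joining an empty list yields an empty pattern, which matches at every position in the text.

```python
import re

def replace_words(text, words, replacement):
    """Replace every occurrence of any string in `words` in a single pass."""
    if not words:
        # '|'.join([]) is '', and an empty pattern matches at every position,
        # so return the text unchanged instead of compiling it.
        return text
    # re.escape makes each word match literally, even if it contains
    # regex metacharacters such as '.' or '+'.
    pattern = re.compile('|'.join(map(re.escape, words)))
    return pattern.sub(replacement, text)

result = replace_words('this message contains Some really Random Words',
                       ['Some', 'Random', 'Words'], '<replaced>')
print(result)
```

If the same word list is reused for many messages, compiling the pattern once outside the function and calling its sub method directly avoids rebuilding it on every call.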
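Note that calling str.replace in a loop can lead to subtle bugs, because a later replacement can match text produced by an earlier one: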
>>> words = ['random', 'words']
>>> text = 'a sample message with random words'
>>> for word in words:
...     text = text.replace(word, 'swords')
...
>>> text
'a sample message with sswords swords'
whereas re.sub gives the correct result:
>>> big_regex = re.compile('|'.join(map(re.escape, words)))
>>> big_regex.sub("swords", 'a sample message with random words')
'a sample message with swords swords'
As thg435 points out, if you want to replace whole words rather than every matching substring, you can add word boundaries to the regex:
big_regex = re.compile(r'\b%s\b' % r'\b|\b'.join(map(re.escape, words)))
This replaces 'random' in 'random words' but not in 'pseudorandom words'.
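A quick check of the word-boundary version against both inputs, reusing the words list from the earlier example (the two test strings are only illustrative):

```python
import re

words = ['random', 'words']
# Same pattern as above: each escaped word is wrapped in \b ... \b.
big_regex = re.compile(r'\b%s\b' % r'\b|\b'.join(map(re.escape, words)))

# Both words stand alone here, so both are replaced.
print(big_regex.sub('swords', 'random words'))        # swords swords
# 'random' is embedded in 'pseudorandom', so only 'words' is replaced.
print(big_regex.sub('swords', 'pseudorandom words'))  # pseudorandom swords
```

Keep in mind that \b only marks a transition between a word character and a non-word character, so this approach assumes every entry in words starts and ends with a word character; an entry such as 'C++' would need different handling.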