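My task is to add a space before and after each punctuation mark. Currently I do this with repeated str.replace() calls, replacing each punctuation character p with " "+p+" ". How can I get the same output with str.translate() by passing in two lists or a dictionary: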
import string

inlist = string.punctuation
outlist = [" "+p+" " for p in string.punctuation]
inoutdict = {p:" "+p+" " for p in string.punctuation}
Let's assume that all of my punctuation is in string.punctuation. Currently, I am doing this:
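from string import punctuation as punct

def punct_tokenize(text):
    # Pad punctuation with spaces: one str.replace() call per punctuation occurrence in the text.
    for ch in text:
        if ch in punct:
            text = text.replace(ch, " "+ch+" ")
    # Collapse the extra whitespace introduced by the padding.
    return " ".join(text.split())

sent = "This's a foo-bar sentences with many, many punctuation."
print(punct_tokenize(sent))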
This iterative str.replace() also takes too long. Would str.translate() be faster?
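
For reference, here is a minimal sketch of what the str.translate() version might look like. It assumes Python 3, where str.maketrans() accepts a dictionary mapping single characters to replacement strings of any length. The names PUNCT_TABLE and punct_tokenize_translate are made up for this illustration, and I have not timed it against the str.replace() loop.

```python
import string

# Assumption: Python 3. str.maketrans() turns a {char: replacement} dict
# into a translation table; replacements may be longer than one character.
PUNCT_TABLE = str.maketrans({p: " " + p + " " for p in string.punctuation})

def punct_tokenize_translate(text):
    # A single translate() call pads every punctuation character with spaces;
    # split/join then collapses the resulting runs of whitespace.
    return " ".join(text.translate(PUNCT_TABLE).split())

sent = "This's a foo-bar sentences with many, many punctuation."
print(punct_tokenize_translate(sent))
# This ' s a foo - bar sentences with many , many punctuation .
```

The table is built once at module level, so each call only pays for one translate() pass plus the split/join, rather than one full-string replace() per punctuation occurrence.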