I'm trying to filter the following sentence
'I'm using C++ in high-tech applications!', said peter (in a confident way)
into its words, without the punctuation, to get
I'm using C++ in high-tech applications said peter in a confident way
What I have so far is
import re
import string
parsing = re.findall(r"\w+(?:[-']\w+)*|'|[-.(]+|\S\w*", text)
' '.join(w for w in parsing if w not in string.punctuation)
However, this produces
I'm using C in high-tech applications said peter in a confident way
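So 'C++' wrongly becomes 'C': each '+' is split off as its own token and then dropped, because '+' is in string.punctuation. Is there a way to modify the regex so that the '+' characters stay attached to the word instead of being tokenized separately? Any alternative approach that gives the desired output would also be welcome. Thanks!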
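For reference, here is a minimal sketch of one possible direction, not necessarily the best one. It assumes it is enough to let a word carry trailing '+' signs, so the only change to the pattern is the added `\+*` after the first alternative. The names `TOKEN_RE`, `PUNCT`, and `tokenize` are hypothetical and only used for illustration. The filter also checks every character of a token, which differs slightly from the substring test in the snippet above.

```python
import re
import string

text = "'I'm using C++ in high-tech applications!', said peter (in a confident way)"

# Same pattern as in the question, plus \+* so that a word may end in
# one or more '+' signs ("C++" stays a single token).
TOKEN_RE = re.compile(r"\w+(?:[-']\w+)*\+*|'|[-.(]+|\S\w*")

PUNCT = set(string.punctuation)

def tokenize(sentence):
    """Return the tokens of `sentence`, dropping pure-punctuation tokens."""
    tokens = TOKEN_RE.findall(sentence)
    # Keep a token unless every one of its characters is punctuation.
    return [t for t in tokens if not all(ch in PUNCT for ch in t)]

result = " ".join(tokenize(text))
print(result)
# I'm using C++ in high-tech applications said peter in a confident way
```

Because "C++" now contains a letter, it is no longer treated as punctuation and survives the filter. A leading '+' or a '+' standing on its own would still be tokenized separately and dropped.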