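I split a string without removing the delimiters by wrapping the entire regex in capturing parentheses. The goal is to match sentences that end in one or more '[!?]' characters.
Everything works, except that I now get unwanted empty strings in the result. How can I suppress these in the least hacky, most regex-native way?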
>>> re.compile(r'([^!?]*[!?]+)').split('Great customer service! Very happy! Will go again')
['', 'Great customer service!', '', ' Very happy!', ' Will go again']
>>> re.compile(r'([^!?]{2,}[!?]+)').split('Great customer service! Very happy! Will go again')
['', 'Great customer service!', '', ' Very happy!', ' Will go again']
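(This is all deeply nested inside more complicated regexes and sub-functions, so I really don't want a hack. I'd like the solution to be a regex, so that I can fold it into the more complex regex.)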
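For reference, the empty strings come from how `re.split` works: it returns the text between matches, and here consecutive matches are adjacent (and the first match starts at index 0), so the text between them is empty. One possible direction, offered as a sketch rather than a definitive fix, is to match the pieces with `re.findall` instead of splitting on them, using an alternation to pick up a trailing remainder that has no terminator. The names `SENTENCE` and `split_sentences` are hypothetical, and the sketch assumes the pattern can be used without its own capturing group.

```python
import re

# Hypothetical name. First alternative: a sentence ending in one or more
# '!' or '?'. Second alternative: a trailing remainder with no terminator.
SENTENCE = re.compile(r'[^!?]*[!?]+|[^!?]+')

def split_sentences(text):
    # With no capturing group in the pattern, findall returns the full
    # matches, and neither alternative can match the empty string.
    return SENTENCE.findall(text)

print(split_sentences('Great customer service! Very happy! Will go again'))
# ['Great customer service!', ' Very happy!', ' Will go again']
```

Because neither alternative can match an empty string, no empty entries are produced, and the pattern stays a plain regex that could in principle be embedded in a larger one.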