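I have this text: Hi, my name is Will, And i am from Canada. I have 2 pets. One is a dog and the other is a Zebra. Ahoi! Thanks.

I want to split it at '.' and '!'. How can I do that? I also want to know which character each sentence was split on.

For example, the result should be:

Example 1: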

Hi, my name is Will, And i am from Canada || The sentence was split with .

Example 2:

Ahoi! || The sentence was split with !

How can I do this? My work so far:

print(text.split('.')) - this only splits the text on '.', and I cannot tell which character each split occurred on.

2 Answers

You can use re.split():

re.split('[.!]', text)

This splits on any character in the [...] character class:

>>> import re
>>> text = 'Hi, my name is Will, And i am from Canada. I have 2 pets. One is a dog and the other is a Zebra. Ahoi! Thanks.'
>>> re.split('[.!]', text)
['Hi, my name is Will, And i am from Canada', ' I have 2 pets', ' One is a dog and the other is a Zebra', ' Ahoi', ' Thanks', '']

You can wrap the split expression in a capturing group so that each delimiter appears in the output as its own list element:

>>> re.split('([.!])', text)
['Hi, my name is Will, And i am from Canada', '.', ' I have 2 pets', '.', ' One is a dog and the other is a Zebra', '.', ' Ahoi', '!', ' Thanks', '.', '']

To keep the punctuation attached to each sentence, use re.findall() instead:

>>> re.findall('[^.!]+?[.!]', text)
['Hi, my name is Will, And i am from Canada.', ' I have 2 pets.', ' One is a dog and the other is a Zebra.', ' Ahoi!', ' Thanks.']
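
If you want both the sentence and its terminator without re-parsing the strings, a variation with two capturing groups and re.finditer() might look like the sketch below. The pattern and the sample text are assumptions made for illustration, not part of the approach above.

```python
import re

# Sample text chosen to show one caveat: the last fragment has no terminator.
text = 'Ahoi! Thanks. No terminator here'

# Group 1 is the sentence body, group 2 is the character that ended it.
# The leading \s* keeps the separating space out of group 1.
pattern = re.compile(r'\s*([^.!]+)([.!])')

results = [(m.group(1), m.group(2)) for m in pattern.finditer(text)]
print(results)
```

As with the findall() version, text that is not followed by '.' or '!' never matches, so the final fragment is silently skipped.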
answered 2013-07-27T10:14:56.360

>>> import re
>>> sp = re.split(r'(\.)|(!)', 'aaa.bbb!ccc!ddd.eee')
>>> sp
['aaa', '.', None, 'bbb', None, '!', 'ccc', None, '!', 'ddd', '.', None, 'eee']
>>> sp[::3]   # the split pieces
['aaa', 'bbb', 'ccc', 'ddd', 'eee']
>>> sp[1::3]  # '.' where the split was on `.`, otherwise None
['.', None, None, '.']
>>> sp[2::3]  # '!' where the split was on `!`, otherwise None
[None, '!', '!', None]
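
The three slices can be merged back into (piece, delimiter) pairs. A rough sketch, assuming the same two-group pattern; `delims` and `pairs` are illustrative names:

```python
import re

sp = re.split(r'(\.)|(!)', 'aaa.bbb!ccc!ddd.eee')

# Each split point fills two slots, one per group, and exactly one is not None.
delims = [dot or bang for dot, bang in zip(sp[1::3], sp[2::3])]

# Pair every piece with the delimiter that followed it; the last piece
# ('eee') has no delimiter after it, so zip() leaves it out.
pairs = list(zip(sp[::3], delims))
print(pairs)
```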
answered 2013-07-27T10:20:41.150