3

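This should be a very simple task using the re library. However, I can't seem to split my string at the delimiters ] and [.

I have already read "Splitting a string with multiple delimiters in Python", "Python: Split string with multiple delimiters", and "Python: How to get multiple elements inside square brackets".

My string: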

data = """This is a string spanning over multiple lines.
        At somepoint there will be square brackets.

        [like this]

        And then maybe some more text.

        [And another text in square brackets]"""

It should return:

['This is a string spanning over multiple lines.\nAt somepoint there will be square brackets.', 'like this', 'And then maybe some more text.', 'And another text in square brackets']

A short example:

data2 = 'A new string. [with brackets] another line [and a bracket]'

I tried:

re.split(r'(\[|\])', data2)
re.split(r'([|])', data2)

But these either include the delimiters in my result list or produce a completely wrong list:

['A new string. ', '[', 'with brackets', ']', ' another line ', '[', 'and a bracket', ']', '']

The result should be:

['A new string.', 'with brackets', 'another line', 'and a bracket']

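As a special requirement, all newlines and whitespace before and after the delimiters should be removed and should not appear in the list.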

4 Answers

7
>>> re.split(r'\[|\]', data2)
['A new string. ', 'with brackets', ' another line ', 'and a bracket', '']
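
The split above still leaves padding around some pieces and an empty string at the end. A minimal sketch of one way to tidy that up afterwards, assuming plain `str.strip()` is acceptable for the "remove whitespace and newlines" requirement (the names `parts` and `cleaned` are only illustrative):

```python
import re

data2 = 'A new string. [with brackets] another line [and a bracket]'

# Split on either bracket; there is no capturing group, so the
# delimiters themselves are not returned.
parts = re.split(r'\[|\]', data2)

# Trim whitespace from each piece and drop pieces that end up empty.
cleaned = [p.strip() for p in parts if p.strip()]
print(cleaned)
```

This keeps the regex trivial and moves the cleanup into ordinary string handling.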
answered 2013-06-11T16:57:53.413
5

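As arshajii pointed out, you don't need a group at all for this particular regex.

If you do need groups to express a more complicated regex, you can use a non-capturing group to split without capturing the delimiters. That can be useful in other cases, but here it is syntactic overkill.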

(?:...)

A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.

http://docs.python.org/2/library/re.html

So an unnecessarily complicated but illustrative example here is:

re.split(r'(?:\[|\])', data2)
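
To make the difference concrete, here is a small side-by-side sketch of the question's capturing pattern and the non-capturing one. The variable names `with_group` and `without_group` are only illustrative.

```python
import re

data2 = 'A new string. [with brackets] another line [and a bracket]'

# Capturing group: re.split also returns the text matched by the group,
# so the brackets show up in the result.
with_group = re.split(r'(\[|\])', data2)

# Non-capturing group: same alternation, but nothing is captured,
# so only the text between the delimiters is returned.
without_group = re.split(r'(?:\[|\])', data2)

print(with_group)
print(without_group)
```

Every second element of `with_group` is a delimiter; dropping those gives the same list as `without_group`.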
answered 2013-06-11T16:56:57.840
2

Use this instead (no capturing group):

re.split(r'\s*\[|]\s*', data)

Or shorter:

re.split(r'\s*[][]\s*', data)
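
As a rough sketch of how this might be applied to the multi-line string from the question: the version below uses the equivalent escaped class `[\[\]]`, drops empty pieces, and collapses the indentation around line breaks inside a piece. It assumes `data` is written as a triple-quoted string; the indentation cleanup with `re.sub` is an extra step not taken from the answer, and `pieces` is an illustrative name.

```python
import re

data = """This is a string spanning over multiple lines.
        At somepoint there will be square brackets.

        [like this]

        And then maybe some more text.

        [And another text in square brackets]"""

# Split on either bracket, consuming any whitespace (including newlines)
# next to it, then drop the empty string left after the final "]".
pieces = [p for p in re.split(r'\s*[\[\]]\s*', data) if p]

# Assumption: indentation around line breaks inside a piece is unwanted,
# so collapse "spaces + newline + spaces" to a single newline.
pieces = [re.sub(r'[ \t]*\n[ \t]*', '\n', p) for p in pieces]
print(pieces)
```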
answered 2013-06-11T16:57:28.343
0

You can either split or use findall, for example:

data2 = 'A new string. [with brackets] another line [and a bracket]'

Using split, consuming the whitespace around the delimiters and filtering out empty strings:

import re
# filter(None, ...) drops empty strings; list() is needed on Python 3, where filter is lazy
print(list(filter(None, re.split(r'\s*[\[\]]\s*', data2))))
# ['A new string.', 'with brackets', 'another line', 'and a bracket']

Or perhaps, using a findall approach:

# Note: inside a character class, \b means backspace, not a word boundary
print(re.findall(r'[^\b\[\]]+', data2))
# ['A new string. ', 'with brackets', ' another line ', 'and a bracket'] # needs a little work on leading/trailing stuff...
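
One possible way to finish the "leading/trailing stuff" for the findall variant is to strip each match afterwards. This is only a sketch: it uses the simpler class `[^\[\]]` (without the backspace escape) and the illustrative name `chunks`.

```python
import re

data2 = 'A new string. [with brackets] another line [and a bracket]'

# Collect every run of characters that are not brackets, then trim each run.
chunks = [m.strip() for m in re.findall(r'[^\[\]]+', data2)]

# Drop runs that contained only whitespace (none in this sample, but
# possible when two brackets are separated only by spaces or newlines).
chunks = [c for c in chunks if c]
print(chunks)
```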
answered 2013-06-11T17:00:50.183