You can solve this problem with the Scanner class of the re module.
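Since re.Scanner is not described in the official re documentation, here is a minimal sketch of how it behaves, as far as I understand it. The patterns and the input string are made up purely for illustration and are not part of the solution: each lexicon entry pairs a pattern with an action, a callable action receives the scanner and the matched text, and a token whose action is (or returns) None is dropped from the result.

```python
import re

# Each lexicon entry is (pattern, action). A callable action is invoked as
# action(scanner, matched_text); whatever it returns is appended to the
# token list, unless it returns None, in which case the token is dropped.
scanner = re.Scanner([
    (r'\d+', lambda sc, tok: int(tok)),   # keep numbers, converted to int
    (r'[a-z]+', lambda sc, tok: tok),     # keep lowercase words unchanged
    (r'\s+', None),                       # skip whitespace entirely
])

# scan() returns the list of tokens and the unmatched rest of the string.
tokens, remainder = scanner.scan('abc 12 de!')
print(tokens)     # ['abc', 12, 'de']
print(remainder)  # !
```

The solution below relies on exactly this "return None to drop the token" behaviour to hide nested braces from the output while still recording them.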
I use the following list of strings as test input:
l = ['something{now I am wrapped {I should not cause splitting} I am still wrapped}everything else',
'something{now I am wrapped} here {and there} listen',
'something{now I am wrapped {I should {not} cause splitting} I am still wrapped}everything',
'something{now {I {am}} wrapped {I should {{{not}}} cause splitting} I am still wrapped}everything']
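Create a class in which I keep track of how many curly braces are currently open, along with the text between the outermost pair. It has three methods: one matches an opening curly brace, one matches a closing curly brace, and the last matches the text in between. Depending on whether the stack (here just the opened_cb counter) is empty, I take a different action: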
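class Cb():
    def __init__(self):
        self.results = []    # pieces of text seen inside the outermost braces
        self.opened_cb = 0   # number of currently open curly braces

    def s_text_until_cb(self, scanner, token):
        # Text outside any braces is emitted directly; text inside is buffered.
        if self.opened_cb == 0:
            return token
        else:
            self.results.append(token)
            return None

    def s_opening_cb(self, scanner, token):
        self.opened_cb += 1
        # Only the outermost '{' becomes a token; nested ones are buffered.
        if self.opened_cb == 1:
            return token
        self.results.append(token)
        return None

    def s_closing_cb(self, scanner, token):
        self.opened_cb -= 1
        # The outermost '}' flushes the buffered text, followed by the brace.
        if self.opened_cb == 0:
            t = [''.join(self.results), token]
            self.results.clear()
            return t
        else:
            self.results.append(token)
            return None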
Finally, I create the Scanner and join the results into a simple flat list:
for s in l:
    results = []
    cb = Cb()
    scanner = re.Scanner([
        (r'[^{}]+', cb.s_text_until_cb),
        (r'[{]', cb.s_opening_cb),
        (r'[}]', cb.s_closing_cb),
    ])
    r = scanner.scan(s)[0]
    for elem in r:
        if isinstance(elem, list):
            results.extend(elem)
        else:
            results.append(elem)
    print('Original string --> {0}\nResult --> {1}\n\n'.format(s, results))
Here is the complete program, followed by a run that shows the results:
import re
l = ['something{now I am wrapped {I should not cause splitting} I am still wrapped}everything else',
'something{now I am wrapped} here {and there} listen',
'something{now I am wrapped {I should {not} cause splitting} I am still wrapped}everything',
'something{now {I {am}} wrapped {I should {{{not}}} cause splitting} I am still wrapped}everything']
class Cb():
    def __init__(self):
        self.results = []    # pieces of text seen inside the outermost braces
        self.opened_cb = 0   # number of currently open curly braces

    def s_text_until_cb(self, scanner, token):
        # Text outside any braces is emitted directly; text inside is buffered.
        if self.opened_cb == 0:
            return token
        else:
            self.results.append(token)
            return None

    def s_opening_cb(self, scanner, token):
        self.opened_cb += 1
        # Only the outermost '{' becomes a token; nested ones are buffered.
        if self.opened_cb == 1:
            return token
        self.results.append(token)
        return None

    def s_closing_cb(self, scanner, token):
        self.opened_cb -= 1
        # The outermost '}' flushes the buffered text, followed by the brace.
        if self.opened_cb == 0:
            t = [''.join(self.results), token]
            self.results.clear()
            return t
        else:
            self.results.append(token)
            return None

for s in l:
    results = []
    cb = Cb()
    scanner = re.Scanner([
        (r'[^{}]+', cb.s_text_until_cb),
        (r'[{]', cb.s_opening_cb),
        (r'[}]', cb.s_closing_cb),
    ])
    r = scanner.scan(s)[0]
    # The outermost '}' yields a two-element list, so flatten while collecting.
    for elem in r:
        if isinstance(elem, list):
            results.extend(elem)
        else:
            results.append(elem)
    print('Original string --> {0}\nResult --> {1}\n\n'.format(s, results))
Run it like this:
python3 script.py
That yields:
Original string --> something{now I am wrapped {I should not cause splitting} I am still wrapped}everything else
Result --> ['something', '{', 'now I am wrapped {I should not cause splitting} I am still wrapped', '}', 'everything else']
Original string --> something{now I am wrapped} here {and there} listen
Result --> ['something', '{', 'now I am wrapped', '}', ' here ', '{', 'and there', '}', ' listen']
Original string --> something{now I am wrapped {I should {not} cause splitting} I am still wrapped}everything
Result --> ['something', '{', 'now I am wrapped {I should {not} cause splitting} I am still wrapped', '}', 'everything']
Original string --> something{now {I {am}} wrapped {I should {{{not}}} cause splitting} I am still wrapped}everything
Result --> ['something', '{', 'now {I {am}} wrapped {I should {{{not}}} cause splitting} I am still wrapped', '}', 'everything']
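If you want to reuse this outside a script, the same idea can be wrapped in a function. What follows is only a sketch under my own assumptions, not the code above: the name split_outer_braces is hypothetical, the callbacks write straight into an output list instead of returning tokens, and raising ValueError on unbalanced braces is an addition the original program does not make.

```python
import re


def split_outer_braces(text):
    """Split text at top-level braces; nested braces stay in the inner text.

    Hypothetical helper for illustration. Raises ValueError when the braces
    in text do not balance (an assumption, not part of the original program).
    """
    depth = 0    # number of currently open curly braces
    inner = []   # pieces collected inside the outermost pair
    out = []     # final flat list of tokens

    def on_text(scanner, token):
        (out if depth == 0 else inner).append(token)

    def on_open(scanner, token):
        nonlocal depth
        depth += 1
        # Only the outermost '{' is a token of its own.
        (out if depth == 1 else inner).append(token)

    def on_close(scanner, token):
        nonlocal depth
        depth -= 1
        if depth < 0:
            raise ValueError('unmatched closing brace')
        if depth == 0:
            # Outermost '}': flush the buffered inner text, then the brace.
            out.append(''.join(inner))
            inner.clear()
            out.append(token)
        else:
            inner.append(token)

    # Every callback returns None, so scan() itself yields no tokens;
    # the result is accumulated in `out` instead.
    scanner = re.Scanner([
        (r'[^{}]+', on_text),
        (r'[{]', on_open),
        (r'[}]', on_close),
    ])
    scanner.scan(text)
    if depth != 0:
        raise ValueError('unclosed opening brace')
    return out


print(split_outer_braces('something{now I am wrapped} here {and there} listen'))
```

Called on the second test string, this prints the same list as the second result above. Collecting into out directly removes the need for the isinstance flattening step, at the cost of no longer using the token list that scan() returns.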