python-3.x - Python3 re.split() on characters that are not inside a special substring

Question

I am trying to parse and validate a small language. I want to tokenize the input so that I can check its syntax. My input string is:

something > 0 AND (something CONTAINS "substr" OR NOT something)

If I do this:

tokens = re.split(r"([\s()])", input)

I get:

['something', ' ', '>', ' ', '0', ' ', 'AND', ' ', '', '(', 'something', ' ', 'CONTAINS', ' ', '"substr"', ' ', 'OR', ' ', 'NOT', ' ', 'something', ')', '']

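This is exactly what I want. But there is always a catch. If I replace "substr" with "substr with whitespace", that part of the array becomes the following, which is not the result I want: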

['"substr', ' ', 'with', ' ', 'whitespace"']

Is there any way to split it as follows?

['"substr with whitespace"']

Or how can I efficiently fix this "so close" split? Or maybe I am missing something different altogether...

score 0 · Accepted Answer

Just splitting on the operators and keywords instead of on whitespace:

re.split(r"\s*(NOT|AND|OR|\(|\)|CONTAINS|<|>|=)\s*", input)

solved my problem.
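For anyone trying this out, here is a minimal runnable sketch of that split applied to the sample input from the question. The function names and the variable `text` (used instead of `input`, to avoid shadowing the builtin) are my own. Filtering out the empty strings is also an addition. The second function is a hypothetical alternative, not part of the original answer: it uses `re.findall` and tries a double-quoted string first, which may help if a quoted value can itself contain a keyword such as AND.

```python
import re

# Pattern from the answer: split on operators and keywords,
# swallowing the whitespace around them.
SPLIT_PATTERN = r"\s*(NOT|AND|OR|\(|\)|CONTAINS|<|>|=)\s*"

def tokenize_by_split(text):
    # re.split keeps the captured delimiters. It also yields empty
    # strings between adjacent delimiters and at the ends; drop those.
    return [tok for tok in re.split(SPLIT_PATTERN, text) if tok]

# Hypothetical alternative (an assumption, not from the answer):
# match tokens instead of splitting. A double-quoted string is tried
# first, so its contents are never broken up.
TOKEN_PATTERN = r'"[^"]*"|[()<>=]|[^\s()<>="]+'

def tokenize_by_findall(text):
    return re.findall(TOKEN_PATTERN, text)

text = 'something > 0 AND (something CONTAINS "substr with whitespace" OR NOT something)'
print(tokenize_by_split(text))
print(tokenize_by_findall(text))

# A keyword inside the quotes shows where the two approaches differ.
tricky = 'name CONTAINS "AND more"'
print(tokenize_by_split(tricky))
print(tokenize_by_findall(tricky))
```

On the sample input both functions give the same token list, with "substr with whitespace" kept as one token. On the second input the split-based version cuts the quoted string at AND, while the findall-based one keeps it whole. Note that the findall sketch silently skips characters it cannot match (for example an unterminated quote), so a real validator would still need to check for leftovers.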

answered 2013-04-05T10:59:32.627
