
I want to use pyparsing to parse an expression of the form expr = '(gimme [some {nested [lists]}])' and get back a Python list of the form [[['gimme', ['some', ['nested', ['lists']]]]]]. Right now my grammar looks like this:

nestedParens = nestedExpr('(', ')')
nestedBrackets = nestedExpr('[', ']')
nestedCurlies = nestedExpr('{', '}')
enclosed = nestedParens | nestedBrackets | nestedCurlies

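Currently, enclosed.searchString(expr) returns a list of the form [[['gimme', ['some', '{nested', '[lists]}']]]]. This is not what I want, because the square brackets and curly braces are not recognized as nesting, and I don't know why.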


2 Answers

27

Here is a pyparsing solution that uses a self-modifying grammar to dynamically match the correct closing bracket character.

from pyparsing import *

data = '(gimme [some {nested, nested [lists]}])'

opening = oneOf("( { [")
nonBracePrintables = ''.join(c for c in printables if c not in '(){}[]')
closingFor = dict(zip("({[",")}]"))
closing = Forward()
# initialize closing with an expression
closing << NoMatch()
closingStack = []
def pushClosing(t):
    closingStack.append(closing.expr)
    closing << Literal( closingFor[t[0]] )
def popClosing():
    closing << closingStack.pop()
opening.setParseAction(pushClosing)
closing.setParseAction(popClosing)

matchedNesting = nestedExpr( opening, closing, Word(alphas) | Word(nonBracePrintables) )

print(matchedNesting.parseString(data).asList())

prints:

[['gimme', ['some', ['nested', ',', 'nested', ['lists']]]]]
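
For anyone who wants to see the closing-stack idea in isolation, here is a minimal sketch in plain Python with no pyparsing involved. It is not the grammar above, just the same bookkeeping done by hand: each opener pushes the closer it expects, and each closer must match the top of the stack. The names parse_nested, CLOSING_FOR and TOKEN are made up for this illustration, and the tokenizer only approximates Word(alphas) | Word(nonBracePrintables).

```python
import re

# Hypothetical helper names; this is a hand-rolled sketch, not pyparsing API.
CLOSING_FOR = {"(": ")", "[": "]", "{": "}"}

# One bracket, or a run of ASCII letters, or a run of other non-space,
# non-bracket characters (roughly Word(alphas) | Word(nonBracePrintables)).
TOKEN = re.compile(r"[()\[\]{}]|[A-Za-z]+|[^\sA-Za-z()\[\]{}]+")

def parse_nested(text):
    """Return nested lists for bracketed text, checking that brackets match."""
    root = []
    lists = [root]   # lists currently being filled, innermost last
    expected = []    # closers still expected, innermost last
    for tok in TOKEN.findall(text):
        if tok in CLOSING_FOR:
            # Opener: start a child list and remember which closer ends it.
            child = []
            lists[-1].append(child)
            lists.append(child)
            expected.append(CLOSING_FOR[tok])
        elif tok in ")]}":
            # Closer: it must be the one the innermost opener asked for.
            if not expected or expected.pop() != tok:
                raise ValueError("unexpected closer %r" % tok)
            lists.pop()
        else:
            lists[-1].append(tok)
    if expected:
        raise ValueError("missing closer %r" % expected[-1])
    return root

print(parse_nested('(gimme [some {nested, nested [lists]}])'))
```

Unlike a grammar that treats all brackets alike, this rejects input such as '(a]' because the closer on top of the stack is ')'.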

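UPDATE: I posted the above solution because I had actually written it over a year ago as an experiment. I just took a closer look at your original post, and it made me think of the recursive type definition created by the operatorPrecedence method, so I redid this solution using your original approach - much simpler to follow! (It might have a left-recursion issue with the right input data, though; this is not thoroughly tested):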

from pyparsing import *

enclosed = Forward()
nestedParens = nestedExpr('(', ')', content=enclosed) 
nestedBrackets = nestedExpr('[', ']', content=enclosed) 
nestedCurlies = nestedExpr('{', '}', content=enclosed) 
enclosed << (Word(alphas) | ',' | nestedParens | nestedBrackets | nestedCurlies)


data = '(gimme [some {nested, nested [lists]}])' 

print(enclosed.parseString(data).asList())

Gives:

[['gimme', ['some', ['nested', ',', 'nested', ['lists']]]]]

EDIT: Here is a diagram of the updated parser, using the railroad diagramming support in pyparsing 3.0. (Image: railroad diagram)

answered 2011-01-26T06:50:33.387
-3

This should do the trick for you. I tested it on your example:

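import re
import ast

def parse(s):
    # Normalize every bracket type to square brackets.
    # Note: this does not check that each opener matches its closer.
    s = re.sub(r"[{(\[]", "[", s)
    s = re.sub(r"[})\]]", "]", s)
    # Quote each run of non-bracket, non-space characters as a string literal.
    s = re.sub(r"[^\[\]\s]+", lambda m: repr(m.group(0)), s)
    # Put commas between adjacent items so the text is a valid list literal.
    s = re.sub(r"(?<=['\"\]])\s*(?=['\"\[])", ", ", s)
    # literal_eval takes an expression, not an assignment; return its result.
    return ast.literal_eval(s)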

Comment if you need more.

answered 2011-01-26T05:01:11.117