
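I just read an article about implementing a parser in Python: http://effbot.org/zone/simple-top-down-parsing.htm

The general idea behind the code is described in this paper: http://mauke.hopto.org/stuff/papers/p41-pratt.pdf

I am fairly new to writing parsers in Python, so I tried to write something similar as a learning exercise. However, when I tried to code up something like what the article shows, I got a TypeError: unbound method TypeError. This is the first time I have run into this kind of error, and I have spent the whole day trying to figure it out without success. Here is a minimal, complete code example that reproduces the problem: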

import re

class Symbol_base(object):
    """ A base class for all symbols"""
    id = None # node/token type name
    value = None #used by literals
    first = second = third = None #used by tree nodes

    def nud(self):
        """ A default implementation for nud """
        raise SyntaxError("Syntax error (%r)." % self.id)

    def led(self,left):
        """ A default implementation for led """
        raise SyntaxError("Unknown operator (%r)." % self.id)

    def __repr__(self):
        if self.id == "(name)" or self.id == "(literal)":
            return "(%s %s)" % (self.id[1:-1], self.value)
        out = [self.id, self.first, self.second, self.third]
        out = map(str, filter(None,out))
        return "(" + " ".join(out) + ")"


symbol_table = {}
def symbol(id, bindingpower=0):
    """ If a given symbol is found in the symbol_table return it.
        If the symbol cannot be found then create the appropriate class
        and add that to the symbol_table."""
    try:
        s = symbol_table[id]
    except KeyError:
        class s(Symbol_base):
            pass
        s.__name__ = "symbol:" + id #for debugging purposes
        s.id = id
        s.lbp = bindingpower
        symbol_table[id] = s
    else:
        s.lbp = max(bindingpower,s.lbp)
    return s

def infix(id, bp):
    """ Helper function for defining the symbols for infix operations """
    def infix_led(self, left):
        self.first = left
        self.second = expression(bp)
        return self
    symbol(id, bp).led = infix_led

#define all the symbols
infix("+", 10)
symbol("(literal)").nud = lambda self: self #literal values must return the symbol itself
symbol("(end)")

token_pat = re.compile(r"\s*(?:(\d+)|(.))")  # raw string so \s and \d reach the regex engine unchanged

def tokenize(program):
    for number, operator in token_pat.findall(program):
        if number:
            symbol = symbol_table["(literal)"]
            s = symbol()
            s.value = number
            yield s
        else:
            symbol = symbol_table.get(operator)
            if not symbol:
                raise SyntaxError("Unknown operator")
            yield symbol
    symbol = symbol_table["(end)"]
    yield symbol()

def expression(rbp = 0):
    global token
    t = token
    token = next()
    left = t.nud()
    while rbp < token.lbp:
        t = token
        token = next()
        left = t.led(left)
    return left

def parse(program):
    global token, next
    next = tokenize(program).next
    token = next()
    return expression()

def __main__():
    print parse("1 + 2")

if __name__ == "__main__":
    __main__()

When I try to run it with pypy:

Traceback (most recent call last):
  File "app_main.py", line 72, in run_toplevel
  File "parser_code_issue.py", line 93, in <module>
    __main__()
  File "parser_code_issue.py", line 90, in __main__
    print parse("1 + 2")
  File "parser_code_issue.py", line 87, in parse
    return expression()
  File "parser_code_issue.py", line 81, in expression
    left = t.led(left)
TypeError: unbound method infix_led() must be called with symbol:+ instance as first argument (got symbol:(literal) instance instead)

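I guess this happens because I never create an instance for the infix operation, but I do not really want to create an instance at that point. Is there some way to change these methods without creating instances?

Any help explaining why this happens and what I can do to fix the code is greatly appreciated!

Also, is this behaviour going to change in Python 3?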


2 Answers


You forgot to create an instance of the symbol in the tokenize() function; when the token is not a number, yield symbol(), not symbol:

else:
    symbol = symbol_table.get(operator)
    if not symbol:
        raise SyntaxError("Unknown operator")
    yield symbol()

With this change, your code prints:

(+ (literal 1) (literal 2))
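
As for why the class cannot stand in for a token, and what changes in Python 3, here is a minimal sketch. The class name Plus is hypothetical and only stands in for the class that symbol("+") builds; infix_led is cut down so that it does not call expression().

```python
class Plus(object):
    """Hypothetical stand-in for the class created by symbol("+")."""
    first = second = None


def infix_led(self, left):
    # Same shape as infix_led in the question, without the recursive parse.
    self.first = left
    return self


# Attach the function to the class, as infix() does in the question.
Plus.led = infix_led

# Looked up on an instance, the function is bound: self is filled in.
node = Plus()
assert node.led("1") is node
assert node.first == "1"

# Looked up on the class, nothing is bound, so "1" lands in the self slot.
# Python 2 rejects the call with the "unbound method" TypeError.
# Python 3 has no unbound methods and raises a TypeError about the
# missing 'left' argument instead.
error = None
try:
    Plus.led("1")
except TypeError as exc:
    error = str(exc)

print(error)
```

So the wording of the error changes in Python 3, but the call still fails on either version until tokenize() yields instances.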
answered 2013-09-20T08:02:12.647

You have not bound the new function to an instance of the object.

import types

obj = symbol(id, bp)
obj.led = types.MethodType(infix_led, obj)

See the accepted answer to another SO question.
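
A minimal sketch of what types.MethodType does with each kind of target, assuming a hypothetical class Symbol in place of the classes stored in symbol_table. Note that symbol(id, bp) in the question returns a class, so binding to it makes self the class itself, and every token of that type would then share first and second.

```python
import types


class Symbol(object):
    """Hypothetical stand-in for a class stored in symbol_table."""
    first = None


def infix_led(self, left):
    # Cut-down version of infix_led from the question.
    self.first = left
    return self


# Bound to an instance: self is that one token, so state stays per token.
token = Symbol()
token.led = types.MethodType(infix_led, token)
assert token.led("1") is token
assert token.first == "1"

# Bound to the class: self is the class, so the attribute is set on the
# class and is visible from every instance that has not overridden it.
Symbol.led = types.MethodType(infix_led, Symbol)
assert Symbol.led("2") is Symbol
assert Symbol().first == "2"
```

Binding to the class may be enough for a single operator, but an input such as "1 + 2 + 3" would reuse the same object for both "+" nodes, so yielding instances from tokenize() is likely the safer fix.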

answered 2013-09-20T08:08:23.157