python - Writing an interpreter in Python. Is isinstance considered harmful?

Question

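I am porting an interpreter for a domain-specific language from Scala, where I originally wrote it, to Python. In the process I have been looking for a way to emulate Scala's case classes, which I used extensively. In the end I resorted to isinstance, but I feel I may be missing something.

Articles such as this one, which attack the use of isinstance, make me wonder whether there is a better way to solve my problem that does not involve a fundamental rewrite.

I have built a number of Python classes, each representing a different type of abstract syntax tree node, such as For, While, Break, Return, Statement, etc.

Scala lets me handle operator evaluation like this: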

case EOp("==",EInt(l),EInt(r)) => EBool(l==r)
case EOp("==",EBool(l),EBool(r)) => EBool(l==r)

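So far, in the Python port, I have made extensive use of elif blocks and isinstance calls to achieve the same effect, but the result is much more verbose and un-Pythonic. Is there a better way?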

score 2 · Accepted Answer

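Summary: this is a common way to write compilers, and it is fine here.

A very common way to handle this in other languages is "pattern matching", which is exactly what you describe. I expect that is the name for the case statement in Scala. It is a very common idiom for writing programming language implementations and tools: compilers, interpreters, and so on. Why is it so good? Because the implementation is completely separated from the data (which is often bad, but generally desirable in compilers).

The problem, then, is that this common idiom for programming language implementation is an anti-pattern in Python. Uh oh. As you can probably tell, this is more of a political problem than a language problem. If other Pythonistas saw the code they would scream; if other language implementers saw it, they would understand it immediately.

The reason it is an anti-pattern in Python is that Python encourages duck-typed interfaces: behaviour should not depend on an object's type, but should be defined by the methods the object has available at runtime. S. Lott's answer works fine if you want idiomatic Python, but it adds little.

I suspect that your design is not really duck-typed. It is a compiler after all, and classes defined by name, with a static structure, are very common there. If you prefer, you can think of your objects as having a "type" field, with isinstance used to pattern match on that type.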
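As a rough illustration of that last point, here is a minimal sketch of the two Scala cases from the question written with isinstance. The class layout (a `value` attribute on EInt and EBool, and `op`, `left`, `right` on EOp) and the `evaluate` function are assumptions made for the example, not something taken from the question's code.

```python
# Hypothetical node classes; the attribute names are assumptions.
class EInt(object):
    def __init__(self, value):
        self.value = value

class EBool(object):
    def __init__(self, value):
        self.value = value

class EOp(object):
    def __init__(self, op, left, right):
        self.op = op
        self.left = left
        self.right = right

def evaluate(node):
    """Reduce an EOp node, using isinstance as a poor man's pattern match."""
    if isinstance(node, EOp) and node.op == "==":
        l, r = node.left, node.right
        # case EOp("==", EInt(l), EInt(r)) => EBool(l == r)
        if isinstance(l, EInt) and isinstance(r, EInt):
            return EBool(l.value == r.value)
        # case EOp("==", EBool(l), EBool(r)) => EBool(l == r)
        if isinstance(l, EBool) and isinstance(r, EBool):
            return EBool(l.value == r.value)
    raise TypeError("no rule matches this node")

result = evaluate(EOp("==", EInt(2), EInt(2)))
```

Each isinstance test plays the role of one Scala case, and the final TypeError stands in for a failed match.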

Addendum:

Pattern matching is probably the number one reason people love writing compilers and the like in functional languages.

score 2 · Accepted Answer

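There is a rule of thumb in Python: if you find yourself writing a large block of if/elif statements with similar conditions (a bunch of isinstance(...) calls, for example), you are probably solving the problem the wrong way.

Better approaches involve classes and polymorphism, the visitor pattern, dict lookup, and so on. In your case, an Operators class with overloads for the different types could work (as noted above), and so could a dict with (type, operator) items.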
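The dict idea could look roughly like the sketch below. The `EValue`, `EInt` and `EBool` classes, the `RULES` name and the `apply_op` helper are all assumptions for illustration; the key is extended to (operator, left type, right type) so that binary operators can be looked up in one step.

```python
import operator

# Hypothetical value classes; `value` holds the underlying Python value.
class EValue(object):
    def __init__(self, value):
        self.value = value

class EInt(EValue):
    pass

class EBool(EValue):
    pass

# (operator symbol, left class, right class) -> (function on raw values, result class)
RULES = {
    ("==", EInt, EInt): (operator.eq, EBool),
    ("==", EBool, EBool): (operator.eq, EBool),
    ("+", EInt, EInt): (operator.add, EInt),
}

def apply_op(op, left, right):
    """Look up the rule for this operator and these exact operand classes."""
    key = (op, type(left), type(right))
    try:
        func, result_class = RULES[key]
    except KeyError:
        raise TypeError("unsupported operands for %r" % (op,))
    return result_class(func(left.value, right.value))
```

Note that the lookup uses type(), so it matches exact classes only; a subclass of EInt would need its own entries.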

score 2 · Accepted Answer

Yes.

Instead of isinstance, just use polymorphism. It's simpler.

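class Node( object ):
    def eval( self, context ):
        # Every concrete node type overrides this.
        raise NotImplementedError

class Add( Node ):
    def __init__( self, arg1, arg2 ):
        self.arg1, self.arg2 = arg1, arg2
    def eval( self, context ):
        return self.arg1.eval( context ) + self.arg2.eval( context )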

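This kind of thing is very simple and never requires isinstance.

What about cases like this, where coercion is required?

Add( Double(this), Integer(that) )

This is still a polymorphism issue.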

class MyType( object ):
    rank = None
    def __init__( self, value ):
        # Assumption: each typed value wraps a plain Python number.
        self.value = value
    def coerce( self, another ):
        # Convert `another` to this type; concrete types override this.
        raise NotImplementedError

class Double( MyType ):
    rank = 2
    def coerce( self, another ):
        return another.toDouble()
    def toDouble( self ):
        return self
    def toInteger( self ):
        return Integer( int( self.value ) )

class Integer( MyType ):
    rank = 1
    def coerce( self, another ):
        return another.toInteger()
    def toDouble( self ):
        return Double( float( self.value ) )
    def toInteger( self ):
        return self

class Operation( Node ):
    def conform( self, this, that ):
        # The higher-ranked operand decides the common type.
        if this.rank > that.rank:
            return this, this.coerce( that )
        return that.coerce( this ), that
    def add( self, this, that ):
        this, that = self.conform( this, that )
        return type( this )( this.value + that.value )

score 1 · Accepted Answer

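The article does not attack isinstance. It attacks the idea of making your code test for specific classes.

And yes, there is a better way. Or several. For example, you can turn the handling of each type into a function and then find the right function by looking it up by type. Like this: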

def int_function(value):
    # Do what you mean to do here
    return value

def str_function(value):
    # Do what you mean to do here
    return value

# Add one entry per type you need to handle.
type_function = {int: int_function, str: str_function}

def handle_value(value):
    function = type_function[type(value)]
    result = function(value)
    print("Oh, lovely", result)

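If you don't want to do this registration yourself, you can look at the Zope Component Architecture, which handles it through interfaces and adapters and is really cool. But it is probably overkill.

Better still is to avoid type checking of any kind, if you can, but that may be tricky.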
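For readers on Python 3.4 or later, the standard library's functools.singledispatch offers a ready-made version of the type-to-function registry described above. This is a swapped-in alternative rather than what the answer proposes, and the handler names and return strings below are made up for the sketch.

```python
from functools import singledispatch

@singledispatch
def handle_value(value):
    # Fallback used when no handler is registered for the value's type.
    raise TypeError("no handler registered for %s" % type(value).__name__)

@handle_value.register(int)
def _handle_int(value):
    return "int: %d" % value

@handle_value.register(str)
def _handle_str(value):
    return "str: %s" % value
```

Unlike a plain dict keyed on type(value), singledispatch follows the class hierarchy, so a subclass of int is routed to the int handler.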

score 0 · Accepted Answer

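In a DSL I wrote in Python 3, I used the Composite design pattern, so the nodes are all polymorphic in use, as S. Lott recommends.

However, when I first read the input to create those nodes, I did use a lot of isinstance checks (against abstract base classes such as collections.Iterable, now collections.abc.Iterable, which Python 3 provides and which I believe are in 2.6 as well), along with hasattr checks for '__call__', since callables were allowed in my input. This was the cleanest way I found to do it (particularly with recursion involved), compared with just trying operations on the input and catching exceptions, which is the alternative that comes to mind. When the input was invalid I raised custom exceptions of my own, to give as much precise failure information as possible.

Using isinstance for such tests is more general than using type(), because isinstance also catches subclasses, and if you can test against abstract base classes, so much the better. For information on abstract base classes, see http://www.python.org/dev/peps/pep-3119/.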
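A minimal sketch of that kind of input reader is shown below. The `describe` function, the `InvalidInputError` exception and the "atom"/"callable" labels are all hypothetical, and the built-in callable() is used in place of the hasattr check mentioned above.

```python
from collections.abc import Iterable

class InvalidInputError(Exception):
    """Raised when the input contains something the reader cannot handle."""

def describe(item, path="input"):
    """Recursively classify raw input, reporting where bad input was found."""
    if callable(item):
        return "callable"
    # Strings are iterable too, so they must be checked before Iterable.
    if isinstance(item, str):
        return "atom"
    if isinstance(item, Iterable):
        return [describe(child, "%s[%d]" % (path, i))
                for i, child in enumerate(item)]
    if isinstance(item, (int, float)):
        return "atom"
    raise InvalidInputError("%s: unsupported %s" % (path, type(item).__name__))
```

Because the check is against the Iterable abstract base class, lists, tuples and generators are all accepted without naming each concrete type.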

score 0 · Accepted Answer

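In this particular case, what you seem to be implementing is an operator overloading system that uses the types of the objects as the selection mechanism for the operator you intend to call. Your node types happen to correspond fairly directly to your language's types, but in reality you are writing an interpreter. The type of a node is just a piece of data.

I don't know whether people can add their own types to your domain-specific language. But I would recommend a table-driven design regardless.

Make a table of data containing (binary_operator, type1, type2, result_type, evalfunc). Search that table for matches using isinstance, and have some criteria for preferring some matches over others. A more sophisticated data structure than a table could make the search faster, but right now you are basically doing a linear search through long lists of if/else statements anyway, so I bet a plain old table will be slightly faster than what you are doing now.

I do not consider isinstance the wrong choice here, largely because the type is just a piece of data your interpreter uses to make a decision. Double dispatch and other techniques of that kind will only obscure the real meat of what your program is doing.

One of the neat things about Python is that, because operator functions and types are all first-class objects, you can stuff them directly into the table (or whatever data structure you choose).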
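One possible shape for such a table is sketched here. The value classes, the `OPERATOR_TABLE` name, the `evaluate_binary` helper and the "earlier rows win" rule are assumptions chosen for the example; the operator module supplies the first-class operator functions.

```python
import operator

# Hypothetical value classes; `value` holds the underlying Python value.
class EValue(object):
    def __init__(self, value):
        self.value = value

class EInt(EValue):
    pass

class EBool(EValue):
    pass

# Rows of (binary_operator, type1, type2, result_type, evalfunc).
# Preference rule assumed here: the first matching row wins.
OPERATOR_TABLE = [
    ("==", EInt, EInt, EBool, operator.eq),
    ("==", EBool, EBool, EBool, operator.eq),
    ("+", EInt, EInt, EInt, operator.add),
    # Fallback: values of different kinds are never equal.
    ("==", EValue, EValue, EBool, lambda l, r: False),
]

def evaluate_binary(op, left, right):
    """Linear search of the table, matching operand types with isinstance."""
    for row_op, type1, type2, result_type, evalfunc in OPERATOR_TABLE:
        if row_op == op and isinstance(left, type1) and isinstance(right, type2):
            return result_type(evalfunc(left.value, right.value))
    raise TypeError("no table entry for operator %r" % (op,))
```

Supporting a new type or operator then means appending a row rather than editing a chain of elif branches.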

score -1 · Accepted Answer

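If you need polymorphism on the arguments (in addition to the receiver), for example to handle type conversions with binary operators as your example suggests, you can use the following trick: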

class EValue(object):

    def __init__(self, v):
        self.value = v

    def __str__(self):
        return str(self.value)

    # First dispatch: the left operand announces its own type to the right one.
    def opequal(self, r):
        r.opequal_value(self)

    # Second dispatch: the right operand now knows both types.
    def opequal_int(self, l):
        print("(int)", l, "==", "(value)", self)

    def opequal_bool(self, l):
        print("(bool)", l, "==", "(value)", self)

    def opequal_value(self, l):
        print("(value)", l, "==", "(value)", self)


class EInt(EValue):

    def opequal(self, r):
        r.opequal_int(self)

    def opequal_int(self, l):
        print("(int)", l, "==", "(int)", self)

    def opequal_bool(self, l):
        print("(bool)", l, "==", "(int)", self)

    def opequal_value(self, l):
        print("(value)", l, "==", "(int)", self)

class EBool(EValue):

    def opequal(self, r):
        r.opequal_bool(self)

    def opequal_int(self, l):
        print("(int)", l, "==", "(bool)", self)

    def opequal_bool(self, l):
        print("(bool)", l, "==", "(bool)", self)

    def opequal_value(self, l):
        print("(value)", l, "==", "(bool)", self)


if __name__ == "__main__":

    v1 = EBool("true")
    v2 = EInt(5)
    v1.opequal(v2)
