python - Returning a list of values from a tree recursion

Question

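I'm trying to teach myself data structures, and I'm implementing a kd-tree in Python. In my kd-tree class I have a method that searches the tree for points within a given radius of a point: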

def within_radius(self, point, radius, result=[]):
    """
    Find all items in the tree within radius of point
    """
    d = self.discriminator

    if in_circle(point, radius, self.data):
        result.append(self.data)

    # Check whether any of the points in the subtrees could be
    # within the circle
    if point[d] - radius < self.data[d] and self.l_child:
        result.append(self.l_child.within_radius(point, radius, result))

    if point[d] + radius > self.data[d] and self.r_child:
        result.append(self.r_child.within_radius(point, radius, result))

    return result

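It works, but the list it returns is pretty funky, with duplicate values from the recursive calls that pass result along. What is a good way of "accumulating" the values returned from a tree recursion into a list? I've been thinking about it for a while, but I can't really see how to do it.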

score 4 · Accepted Answer

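I'm not sure this is the cleanest way to do it, but whenever I do recursion like this, I often add a keyword argument that is the list to be returned. That way, whenever I modify the list, I'm always modifying the same list: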

def recurse(max, _output=None):
    if _output is None:
        _output = []

    #do some work here, add to your list
    _output.append(max)

    if max <= 0: #Some condition where the recursion stops
        return _output
    else:        #recurse with new arguments so that we'll stop someday...
        return recurse(max-1, _output=_output)

This works because when the stopping condition is True, the _output list is returned and passed all the way back up the stack.
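As a quick check of the pattern above, here is a minimal sketch that calls the function twice. The parameter is renamed to n purely for illustration (max shadows a Python built-in); otherwise the logic follows the answer's code.

```python
def recurse(n, _output=None):
    # A fresh list is created only on the outermost call,
    # because recursive calls always pass _output explicitly.
    if _output is None:
        _output = []

    _output.append(n)

    if n <= 0:
        return _output
    return recurse(n - 1, _output=_output)


first = recurse(3)
second = recurse(1)
print(first)   # [3, 2, 1, 0]
print(second)  # [1, 0]
```

The second call starts from an empty list, so nothing leaks over from the first call.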

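I use an underscore-prefixed variable name to indicate that it is meant to be used only within the function itself. This is a slight extension of the usual way underscore-prefixed variables are used (in classes, to indicate that a variable is "private"), but I think it gets the point across...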

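Note that this isn't very different from what you have. However, in your version, result persists between calls, because result = [] is evaluated when the function is created, not when it is called. Also, your version appends the return value (which is the list itself). That gets pretty convoluted once you think about a list that contains multiple references to itself...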
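Both effects can be reproduced in isolation. The sketch below uses two hypothetical helper functions, collect_bad and collect_good, which are not part of the question; they exist only to contrast a mutable default argument with the None sentinel.

```python
def collect_bad(item, result=[]):
    # The default list is built once, when the def statement runs,
    # so every call that omits result shares the same list.
    result.append(item)
    return result


def collect_good(item, result=None):
    # A new list is built on each call that omits result.
    if result is None:
        result = []
    result.append(item)
    return result


bad_first = list(collect_bad(1))    # snapshot copy: [1]
bad_second = list(collect_bad(2))   # snapshot copy: [1, 2]
good_first = collect_good(1)        # [1]
good_second = collect_good(2)       # [2]

# Appending a list to itself stores a reference to the same object,
# which is what result.append(child.within_radius(..., result)) does.
nested = [1]
nested.append(nested)
print(nested[1] is nested)  # True
```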

score 2 · Accepted Answer

I agree with mgilson's analysis. list is a mutable type and list.append is in-place. Here's what that means:

There are two kinds of types: mutable and immutable.

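A mutable object stays at the same location in memory even when you change it. lists and dicts, for example, are mutable types. This means that if you create a list and change it in certain ways, it will still live at the same location in memory. So suppose you create a list called "myList", and suppose this list is at memory location 0x9000. Then doing myList.append(0) will not change the location of myList in memory. Even if you do myList[0] = 'a', the location will not change; it will still be at 0x9000.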

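An immutable object cannot be changed in place; anything that looks like a change produces a new object at a different memory location. strs and tuples are immutable. This is why you get the following error: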

>>> s = 'as'
>>> s[0] = 'b'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

This means that if you define s = 'as' (say s is now at memory address 0x5000) and then redefine it as s = 'af', the location of s in memory will change.

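Now, when you reassign a name that refers to a mutable object, its location in memory changes as well, because the name now points to a different object. For example,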

L = [1,2,3] # say memory location 0x4000
L = [5,6,7] # memory location is no longer 0x4000
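The difference between changing an object and rebinding a name can be checked with the is operator, which compares object identity. This is a minimal sketch; the variable names are made up for illustration.

```python
my_list = [1, 2, 3]
alias = my_list                 # second name for the same list object

my_list.append(4)               # in-place change: still the same object
same_after_append = my_list is alias

my_list = [5, 6, 7]             # rebinding: my_list now names a new object
same_after_rebind = my_list is alias

s = "as"
t = s
s = "af"                        # rebinding again; the old string is untouched

print(same_after_append)  # True
print(same_after_rebind)  # False
print(alias)              # [1, 2, 3, 4]
print(t)                  # as
```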

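This is where the "in-place" property of list.append comes into play. "list.append is in-place" means that the new element is added to the list without creating a new list. This is why list.append returns None rather than a list, as shown here: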

>>> L = [1,2,3]
>>> ret = L.append(4)
>>> print(L)
[1, 2, 3, 4]
>>> print(ret)
None

However, if you want to create a new list, you can do so as follows:

>>> L = [1,2,3]
>>> ret = L + [4]
>>> print(L)
[1, 2, 3]
>>> print(ret)
[1, 2, 3, 4]

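So, what is happening in your case is that both recursive calls (left and right) share the same list: each call appends its matching points to it in place, and then the returned list, which is that same list, is appended to it again. This is why you get duplicate values.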

You can get around this by doing what mgilson suggests, or, if you are a Lisp fan (this is a very good Lisp problem), you can use the [1,2,3] + [4] principle and do this (untested, but should work):

def within_radius(self, point, radius, result=[]):
    """
    Find all items in the tree within radius of point
    """
    d = self.discriminator

    temp = []

    if in_circle(point, radius, self.data):
        temp = [self.data]

    # Check whether any of the points in the subtrees could be
    # within the circle
    if point[d] - radius < self.data[d] and self.l_child:
        temp += self.l_child.within_radius(point, radius, result)

    if point[d] + radius > self.data[d] and self.r_child:
        temp += self.r_child.within_radius(point, radius, result)

    return result+temp
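For a version that can be run on its own, here is a minimal sketch of the same concatenation idea with the result parameter dropped, since nothing needs to be threaded through when every call returns a fresh list. The Node class, the in_circle helper, and the sample points are all assumptions made for illustration; only the attribute names come from the question.

```python
class Node:
    """Hypothetical minimal kd-tree node; attribute names follow the question."""

    def __init__(self, data, discriminator, l_child=None, r_child=None):
        self.data = data
        self.discriminator = discriminator  # index of the splitting axis
        self.l_child = l_child
        self.r_child = r_child

    def within_radius(self, point, radius):
        d = self.discriminator
        # Each call builds its own list; nothing is shared between calls.
        found = [self.data] if in_circle(point, radius, self.data) else []

        if point[d] - radius < self.data[d] and self.l_child:
            found = found + self.l_child.within_radius(point, radius)
        if point[d] + radius > self.data[d] and self.r_child:
            found = found + self.r_child.within_radius(point, radius)
        return found


def in_circle(center, radius, candidate):
    # Assumed helper (not shown in the question): Euclidean distance test.
    return sum((c - p) ** 2 for c, p in zip(center, candidate)) <= radius ** 2


tree = Node(
    (5, 5), 0,
    l_child=Node((2, 3), 1, l_child=Node((1, 1), 0), r_child=Node((4, 6), 0)),
    r_child=Node((8, 6), 1),
)
print(tree.within_radius((4, 5), 1.5))  # [(5, 5), (4, 6)]
```

The trade-off is that every level allocates and discards intermediate lists, which the accumulator style avoids.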

Hope this helps.

score 1 · Accepted Answer

Here are a few thoughts:

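If you only want to return unique results, you should probably use a set and convert it to a list on return. One catch is that self.data has to be hashable, which in practice means immutable, e.g. a tuple rather than a list.
Because you are threading result through the recursion and also adding the results of the recursive calls to it, you are explicitly adding each hit to the result at least twice. Threading the result through the recursion saves you from creating and discarding data structures, so you can probably just do that and nothing more.
As mgilson points out, because of the way Python handles default arguments, setting result to an empty list in the function declaration won't do what you think. Every time you call within_radius without explicitly passing in result, the hits accumulate across all calls rather than for just that single call. (Does that make sense?)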
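The first point, about what a set will accept, can be seen in a few lines. This is only an illustrative sketch with made-up values.

```python
hits = set()
hits.add((1, 2))
hits.add((1, 2))        # an equal tuple is ignored, so duplicates vanish

try:
    hits.add([3, 4])    # lists are unhashable, so this raises TypeError
    list_was_accepted = True
except TypeError:
    list_was_accepted = False

print(hits)               # {(1, 2)}
print(list_was_accepted)  # False
```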

With all of that in mind, I would probably do something like this:

def within_radius(self, point, radius, result=None):
    d = self.discriminator

    result = set() if result is None else result

    if in_circle(point, radius, self.data):
        result.add(self.data)
    if point[d] - radius < self.data[d] and self.l_child:
        self.l_child.within_radius(point, radius, result)
    if point[d] + radius > self.data[d] and self.r_child:
        self.r_child.within_radius(point, radius, result)

    return list(result)
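One possible variation, shown here as a sketch rather than as the answer's own method, splits the work into a public method that owns the set and a private recursive helper that only mutates it. That way the set is converted to a list once, at the top level, instead of in every recursive call. The KDNode class, the _collect helper, in_circle, and the sample points are assumptions for illustration.

```python
class KDNode:
    """Hypothetical minimal kd-tree node; attribute names follow the question."""

    def __init__(self, data, discriminator, l_child=None, r_child=None):
        self.data = data                    # a tuple, so it can live in a set
        self.discriminator = discriminator  # index of the splitting axis
        self.l_child = l_child
        self.r_child = r_child

    def within_radius(self, point, radius):
        """Public entry point: creates the accumulator and converts it once."""
        hits = set()
        self._collect(point, radius, hits)
        return list(hits)

    def _collect(self, point, radius, hits):
        """Recursive helper: adds to hits in place and returns nothing."""
        d = self.discriminator
        if in_circle(point, radius, self.data):
            hits.add(self.data)
        if point[d] - radius < self.data[d] and self.l_child:
            self.l_child._collect(point, radius, hits)
        if point[d] + radius > self.data[d] and self.r_child:
            self.r_child._collect(point, radius, hits)


def in_circle(center, radius, candidate):
    # Assumed helper (not shown in the question): Euclidean distance test.
    return sum((c - p) ** 2 for c, p in zip(center, candidate)) <= radius ** 2


# The point (5, 5) is stored twice on purpose to show de-duplication.
tree = KDNode(
    (5, 5), 0,
    l_child=KDNode((2, 3), 1, r_child=KDNode((4, 6), 0)),
    r_child=KDNode((8, 6), 1, l_child=KDNode((5, 5), 0)),
)
result = tree.within_radius((4, 5), 1.5)
print(sorted(result))  # [(4, 6), (5, 5)]
```

Because a set has no defined order, the example sorts the result before printing it.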
