algorithm - 查找给定总和是否存在于 BST 中的路径上

Question

问题是找出给定的和是否存在于 BST 中的任何路径上。如果路径意味着从根到叶，问题就非常简单，或者如果路径意味着从根到叶的路径的一部分可能不包括根或叶，那么问题就很简单了。但是这里变得很困难，因为路径可能同时跨越节点的左右子节点。例如，在给定的图中，圆圈路径上的总和为 132。我怎样才能找到这样一条路径的存在？使用哈希存储节点下所有可能的总和是不受欢迎的！

在此处输入图像描述

score 2 · Accepted Answer

不是最快但最简单的方法是使用两个嵌套的深度优先搜索。

使用正常的深度优先搜索来获取起始节点。使用深度优先搜索的第二个修改版本来检查从该节点开始的所有路径的总和。

在此处输入图像描述

第二深度优先搜索与普通深度优先搜索在两个细节上有所不同：

它保持当前路径总和。每次将新节点添加到路径时，它都会向总和添加值，并在删除某个节点时从总和中删除值。
它仅以相反的方向遍历从根到起始节点的路径边缘（图中的红色边缘）。像往常一样，所有其他边都以正确的方向遍历（图中的黑色边）。为了以相反的方向遍历边缘，它要么使用原始 BST 的“父”指针（如果有的话），或者窥视第一次深度优先搜索的堆栈以获得这些“父”指针。

每个 DFS 的时间复杂度为 O(N)，因此总时间复杂度为 O(N ² )。空间要求为 O(N)（两个 DFS 堆栈的空间）。如果原始 BST 包含“父”指针，则空间要求为 O(1)（“父”指针允许在没有堆栈的情况下以任何方向遍历树）。

其他方法基于 j_random_hacker 和 robert king 的想法（维护总和列表，匹配它们，然后将它们合并在一起）。它以自下而上的方式处理树（从叶子开始）。

使用 DFS 找到一些叶节点。然后回去找到最后一个分支节点，也就是这个叶子节点的grand-...-grand-parent。这给出了分支和叶节点之间的链。处理这个链：

match1(chain)
sum_list = sum(chain)

match1(chain):
  i = j = sum = 0
  loop:
    while (sum += chain[i]) < target:
      ++i
    while (sum -= chain[j]) > target:
      ++j
    if sum == target:
      success!

sum(chain):
  result = [0]
  sum = 0
  i = chain.length - 1
  loop:
    sum += chain[i]
    --i
    result.append(sum)
  return result

在此处输入图像描述

继续 DFS 并搜索其他叶子链。当找到来自同一节点的两条链时，可能在另一条链之前（图中的红色和绿色链，在蓝色链之前），处理这些链：

match2(parent, sum_list1, sum_list2)
sum_list3 = merge1(parent, sum_list1, sum_list2)

if !chain3.empty:
  match1(chain3)
  match3(sum_list3, chain3)
  sum_list4 = merge2(sum_list3, chain3)

match2(parent, sum_list1, sum_list2):
  i = 0
  j = chain2.length - 1
  sum = target - parent.value
  loop:
    while sum < sum_list1[i] + sum_list2[j]:
      ++i
    while sum > sum_list1[i] + sum_list2[j]:
      --j
    if sum == sum_list1[i] + sum_list2[j]:
      success!

merge1(parent, sum_list1, sum_list2):
  result = [0, parent.value]
  i = j = 1
  loop:
    if sum_list1[i] < sum_list2[j]:
      result.append(parent.value + sum_list1[i])
      ++i
    else:
      result.append(parent.value + sum_list2[j])
      ++j
  return result

match3(sum_list3, chain3):
  i = sum = 0
  j = sum_list3.length - 1
  loop:
    sum += chain3[i++]
    while sum_list3[j] + sum > target:
      --j
    if sum_list3[j] + sum == target:
      success!

merge2(sum_list3, chain3):
  result = [0]
  sum = 0
  i = chain3.length - 1
  loop:
    sum += chain3[i--]
    result.append(sum)
  result.append(sum_list3[1...] + sum)

在任何两个总和列表或一个链和一个总和列表是同一节点的后代的情况下执行相同的操作。该过程可以继续，直到保留属于根节点的单个总和列表。

score 2 · Accepted Answer

我将按顺序遍历左子树并以相反的顺序同时遍历右子树，这与归并排序的工作方式类似。每次移动使 aum 更接近的迭代器。就像合并排序一样。它的顺序 n

score 2 · Accepted Answer

您当然可以生成所有可能的路径，并在进行过程中逐渐累加。树是 BST 的事实可能会让您通过限制某些总和来节省时间，尽管我不确定这会带来渐近速度的增加。问题是使用给定节点的左孩子形成的总和不一定小于使用右孩子形成的总和，因为前一个总和的路径可能包含更多节点。以下算法适用于所有树，而不仅仅是 BST。

要生成所有可能的路径，请注意路径的最高点是特殊的：它是路径中唯一允许（尽管不是必需）将两个子项都包含在路径中的点。每条路径都包含一个唯一的最高点。因此，递归的外层应该是访问每个树节点，并生成以该节点为最高点的所有路径。

// Report whether any path whose topmost node is t sums to target.
// Recurses to examine every node under t.
EnumerateTopmost(Tree t, int target) {
    // Get a list of sums for paths containing the left child.
    // Include a 0 at the start to account for a "zero-length path" that
    // does not contain any children.  This will be in increasing order.
    a = append(0, EnumerateSums(t.left))
    // Do the same for paths containing the right child.  This needs to
    // be sorted in decreasing order.
    b = reverse(append(0, EnumerateSums(t.right)))

    // "List match" to detect any pair of sums that works.
    // This is a linear-time algorithm that takes two sorted lists --
    // one increasing, the other decreasing -- and detects whether there is
    // any pair of elements (one from the first list, the other from the
    // second) that sum to a given value.  Starting at the beginning of
    // each list, we compute the current sum, and proceed to strike out any
    // elements that we know cannot be part of a satisfying pair.
    // If the sum of a[i] and b[j] is too small, then we know that a[i]
    // cannot be part of any satisfying pair, since all remaining elements
    // from b that it could be added to are at least as small as b[j], so we
    // can strike it out (which we do by advancing i by 1).  Similarly if
    // the sum of a[i] and b[j] is too big, then we know that b[j] cannot
    // be part of any satisfying pair, since all remaining elements from a
    // that b[j] could be added to are at least as big as a[i], so we can
    // strike it out (which we do by advancing j by 1).  If we get to the
    // end of either list without finding the right sum, there can be
    // no satisfying pair.
    i = 0
    j = 0
    while (i < length(a) and j < length(b)) {
        if (a[i] + b[j] + t.value < target) {
            i = i + 1
        } else if (a[i] + b[j] + t.value > target) {
            j = j + 1
        } else {
            print "Found!  Topmost node=", t
            return
        }
    }

    // Recurse to examine the rest of the tree.
    EnumerateTopmost(t.left)
    EnumerateTopmost(t.right)
}

// Return a list of all sums that contain t and at most one of its children,
// in increasing order.
EnumerateSums(Tree t) {
    If (t == NULL) {
        // We have been called with the "child" of a leaf node.
        return []     // Empty list
    } else {
        // Include a 0 in one of the child sum lists to stand for
        // "just node t" (arbitrarily picking left here).
        // Note that even if t is a leaf node, we still call ourselves on
        // its "children" here -- in C/C++, a special "NULL" value represents
        // these nonexistent children.
        a = append(0, EnumerateSums(t.left))
        b = EnumerateSums(t.right)
        Add t.value to each element in a
        Add t.value to each element in b
        // "Ordinary" list merge that simply combines two sorted lists
        // to produce a new sorted list, in linear time.
        c = ListMerge(a, b)
        return c
    }
}

上面的伪代码只报告路径中最顶层的节点。可以通过EnumerateSums()返回对列表(sum, goesLeft)而不是简单的总和列表来重建整个路径，其中goesLeft是一个布尔值，指示用于生成该总和的路径最初是否从父节点向左走。

上面的伪代码为每个节点计算了多次总和列表：除了为自身调用之外，还将为树中的EnumerateSums(t)每个节点调用一次。可以对每个节点的总和列表进行记忆，这样它就不会在后续调用中重新计算，~~但实际上这并不能改善渐近性：只需要 O(n) 的工作来使用普通递归，并将其更改为 O(1) 不会改变整体时间复杂度，因为任何调用产生的总和列表通常必须由调用者读取，这需要 O(n) 时间。~~编辑：正如 Evgeny Kluev 所指出的，ttEnumerateSums()~~EnumerateSums()~~ EnumerateSums()实际上表现得像归并排序，当树完全平衡时为 O(nlog n)，当它是单个路径时为 O(n^2)。因此，记忆化实际上会带来渐近的性能改进。

可以通过重新排列到一个类似迭代器的对象中来摆脱临时的总和列表，EnumerateSums()该对象会延迟执行列表合并，并且可以查询以按升序检索下一个总和。这还需要创建一个EnumerateSumsDown()执行相同操作但按降序检索总和的函数，并使用它代替reverse(append(0, EnumerateSums(t.right))). 这样做会将算法的空间复杂度降低到 O(n)，其中 n 是树中的节点数，因为每个迭代器对象都需要恒定的空间（指向左右子迭代器对象的指针，以及记录最后一个总和）并且每个树节点最多可以有一个。

score 1 · Accepted Answer

是否有任何复杂性限制？正如您所说：“如果路径意味着根到叶，则容易，或者如果路径意味着从根到叶的路径的一部分可能不包括根或叶，则容易”。您可以通过将根每次设置为不同的节点并进行 n 次搜索来减少此语句的问题。那将是一种简单的方法，不确定是否最佳。

编辑：如果树是单向的，这种东西可能会起作用（伪代码）：

findSum(tree, sum)
    if(isLeaf(tree))
        return (sum == tree->data)
    for (i = 0 to sum)
        isfound |= findSum(leftSubStree, i) && findSum(rightSubTree, sum-i)
    return isfound;

这里可能有很多错误，但希望它能澄清这个想法。

algorithm - 查找给定总和是否存在于 BST 中的路径上

4 回答 4

Related

Reference