3

我在数据结构作业中遇到了一个问题,我和我的任何同事都无法弄清楚,我们甚至不知道从哪里开始!

问题表明我们应该建议对 B-Tree 进行增强;函数 order(k) - 其中 k 是 B-Tree 中的一个键 - 它将在 O(log n) 中显示该键在 B-Tree 中所有键的排序顺序中的位置。我们还需要证明“增强”不会影响 B-Tree 的常规抽象函数的复杂性。我们可以使用 O(n) 额外空间,其中 n 是 B-Tree 中的键数。

进一步解释:以具有键 ABCDEFGHIJKLM N 的 B-Tree 为例。

  • order(A) 结果应为“1”。
  • order(N) 结果应为“14”。
  • order(I) 结果应为“9”。

到目前为止我发现了什么:

  1. 鉴于我们被允许使用 O(n) 额外空间,并且 B-Tree 常规空间是 O(n),我们应该 - 可能 - 使用额外的 B-Tree 来提供帮助。

  2. 他们提到我们应该证明增强不会影响常规 B-Tree 函数的复杂性的事实,在某些时候我们必须以某种方式操作常规抽象 B-Tree 函数,而不影响它们的常规复杂性。

  3. 我们必须在 O(log n) 中 order(k) 的事实表明我们应该以基于高度的方式遍历 B-Tree,而不是逐个节点。

  4. 在某个地方,可能我们必须检查给定的 k in order(k) 是否确实存在于 B-Tree 中,我建议使用 B-Tree 的常规抽象搜索功能。

4

1 回答 1

2

在每个键上,您应该存储额外的数据来记录该节点下有多少键(包括节点本身)。

为了保持这一点,insert(k) 函数将不得不遍历新键 k 的所有祖先,并增加它们的值。这将使插入 O(log n) + O(log n),仍然是 O(log n),因此不会影响复杂性。delete(k) 必须做同样的事情,除了减少值。平衡操作也必须考虑到这一点。

然后,order(k) 将沿着树向下移动到 k:每次它移动到一个节点时,它应该将左侧有多少键的计数加到总数中,并返回这个总和。

编辑:我改变了节点和键之间“节点”的歧义,因为它们在 B 树中是不同的(一个节点可以包含多个键)。然而,该算法应该推广到大多数树数据结构。

这是B树的算法:

#In python-ish (untested psuedocode)
#root is the root of the tree
#Each node is expected to have an array named "keys",
# which contains the keys in the node.
#Each node is expected to have an array named "child_nodes",
# which contains the children of the node, if the node has children.
#If a node has children, this should be true: len(child_nodes) == len(keys) + 1

def inorder(q):
  order_count = 0
  current_node = root

  while True:
    #if q is after all keys in the node, then we will go to the last child node
    next_child_node_i = len(current_node.keys)

    #now see if q is in between any of the nodes
    #for each key-index in the keys array (ie. if the node contains 3 keys,
    # keyi will be in range [0-2] .)
    for keyi in range(len(current_node.keys)):
      #retrieve the value of the key, so we can do comparison
      current_key = current_node.keys[keyi]

      if current_key < q:
        #We are trying to find which child node to go down to next,
        # for now we will choose the child directly to the left of this key,
        #But we continue to look through the rest of the keys, to find which
        # two keys q lies in between.

        #before we continue, we should count this key in the order too:

        #if this is not a leaf node,
        if len(current_node.children) != 0:
          #retrieve the the recorded child count of the sub-tree
          order_count += current_node.children[keyi].recorded_descendant_key_count

        #add one for the key in this node that we are skipping.
        order_count += 1

        continue

      if q < current_key:
        #We found a key in the current node that is greater than q.
        #Thus we continue to the next level between this and the previous key.
        next_child_node_i = keyi
        break

      #we finally found q,
      if q == current_key:
        #now we just return the count
        return order_count

    #once we are here, we know which keys q lies between
    # (or if it belongs at the beginning or end), and thus which child to travel down to.

    #If this is a leaf node (it has no children),
    # then q was not found.
    if len(current_node.child_nodes) == 0:
      #Possible behaviors: throw exception, or just return the place in the order
      # where q *would* go, like so:
      return order

    #Travel down a level
    current_node = current_node.child_nodes[next_child_node_i]
于 2012-09-10T18:19:35.740 回答