6

I have seen many different implementations of BK Trees in many different languages, and literally none of them seem to include a way to remove nodes from the tree.

Even the original article where BK Trees were first introduced does not provide a meaningful insight about node deletion, as the authors merely suggest to mark the node to be deleted so that it is ignored:

The deletion of a key in Structures 1 [the BK Tree] and 2 follows a process similar to that above, with special consideration for the case in which the key to be deleted is the representative x° [root key]. In this case, the key cannot simply be deleted, as it is essential for the structure information. Instead an extra bit must be used for each key which denotes whether the key actually corresponds to a record or not. The search algorithm is modified correspondingly to ignore keys which do not correspond to records. This involves testing the extra bit in the Update procedure.

While it may be theoretically possible to properly delete a node in a BK Tree, is it possible to do so in linear/sublinear time?

4

1 回答 1

2

虽然理论上可以正确删除 BK 树中的节点,但是否可以在线性/亚线性时间内这样做?

如果您想从 BK-Tree 物理上删除它,那么我想不出一种方法可以在所有情况下在线性时间内完成此操作。考虑2删除节点时的场景。请注意,我没有考虑与计算 Levenshtein 距离相关的时间复杂度,因为该操作不依赖于字数,尽管它也需要一些处理时间。

移除非根节点

  1. 在树中找到节点的父节点。
  2. 保存节点的子节点。
  3. 取消父节点对节点的引用。
  4. 重新添加每个子节点,就好像它是一个新节点一样。

在这里,即使 step1可以在 O(1) 中完成,步骤24成本也要高得多。插入单个节点是 O(h),其中 h 是树的高度。更糟糕的是,必须对原始节点的每个子节点都进行此操作,因此将是 O(k*h),其中 k 是子节点的数量。

移除根节点

  1. 从头开始重建树,而不使用先前的根节点。

重建一棵树在最好的情况下至少为 O(n),否则为 O(h*n)。

替代解决方案

这就是为什么最好不要物理删除节点,而是将其保存在树中并将其标记为deleted. 这样,它将像以前一样用于插入新节点,但会从拼写错误的单词的建议结果中排除。这可以在 O(1) 中完成。

于 2021-04-14T22:25:32.470 回答