algorithm - Why is deleting an element from a hash table O(1) when using a doubly linked list?

Question

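In the CLRS textbook "Introduction to Algorithms", there is this passage on pg. 258:

We can delete an element in O(1) time if the lists are doubly linked. (Note that CHAINED-HASH-DELETE takes as input an element x and not its key k, so that we don't have to search for x first. If the hash table supports deletion, then its linked lists should be doubly linked so that we can delete an item quickly. If the lists were only singly linked, then to delete element x, we would first have to find x in the list so that we could update the next attribute of x's predecessor. With singly linked lists, both deletion and searching would have the same asymptotic running times.)

What confuses me is the long parenthetical: I fail to understand its logic. With a doubly linked list, one still has to find x before deleting it, so how is this different from a singly linked list? Please help me understand it!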

score 32 · Accepted Answer

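The problem presented here is: suppose you are already looking at a particular element of the hash table. How costly is it to delete it?

Suppose you have a simple (singly) linked list: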

v ----> w ----> x ----> y ----> z
                |
            you're here

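Now, if you remove x, you need to connect w to y to keep your list linked. You need to access w and tell it to point to y (you want to have w ----> y). But you can't reach w from x, because the list is only singly linked! So you have to walk the list from its head to find w, which takes O(n) operations, and then tell it to link to y. That's bad.

Then, suppose the list is doubly linked: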
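To make the cost concrete, here is a minimal sketch of deletion from a singly linked list. The names SNode and remove_singly are made up for this illustration and do not come from the textbook or the answer.

```cpp
// Hypothetical singly linked node; names are illustrative only.
struct SNode {
    int key;
    SNode* next;
};

// Unlink x from the list that starts at head and return the new head.
// Worst case O(n): we must walk from the head to find x's predecessor.
SNode* remove_singly(SNode* head, SNode* x) {
    if (head == x) {
        return x->next;              // x was first, so there is no predecessor to patch
    }
    SNode* prev = head;
    while (prev != nullptr && prev->next != x) {
        prev = prev->next;           // linear scan for the node whose next is x
    }
    if (prev != nullptr) {
        prev->next = x->next;        // w ----> y
    }
    return head;
}
```

The loop is the whole problem: having a pointer to x does not help us reach w.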

v <---> w <---> x <---> y <---> z
                |
            you're here

Cool, you can reach both w and y from here, so you can connect the two (w <---> y) in O(1) operations!
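A minimal sketch of the same deletion with back links, assuming a hypothetical DNode type and a head pointer passed by reference so that removing the first node also works:

```cpp
// Hypothetical doubly linked node; names are illustrative only.
struct DNode {
    int key;
    DNode* prev;
    DNode* next;
};

// Unlink x in O(1): both neighbours are reachable from x itself.
void remove_doubly(DNode*& head, DNode* x) {
    if (x->prev != nullptr) {
        x->prev->next = x->next;     // w ----> y
    } else {
        head = x->next;              // x was the first node
    }
    if (x->next != nullptr) {
        x->next->prev = x->prev;     // w <---- y
    }
    x->prev = nullptr;
    x->next = nullptr;
}
```

There is no loop, so the cost does not depend on the length of the list.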

score 2 · Accepted Answer

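It seems to me that the hash table part of this is mostly a red herring. The real question is: "Can we delete the current element from a linked list in constant time, and if so, how?"

The answer is: it's a little tricky, but in effect yes, we can, at least usually. We do not (normally) have to traverse the entire linked list to find the previous element. Instead, we can swap the data between the current element and the next element, then delete the next element.

The one exception is when we need to delete the last item in the list. In that case, there is no next element to swap with. If you really have to do that, there is no real way to avoid finding the previous element. There are, however, ways to generally avoid the situation. One is to terminate the list with a sentinel instead of a null pointer. Since we never delete the node holding the sentinel value, we never have to deal with deleting the last item in the list. This leaves us with relatively simple code, something like this: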

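#include <utility>  // std::swap

template <class key, class data>
struct node {
    key k;
    data d;
    node *next;
};

// Precondition: item is not the last node, i.e. the list ends in a sentinel.
template <class key, class data>
void delete_node(node<key, data> *item) {
    node<key, data> *temp = item->next;
    std::swap(item->k, temp->k);   // move the next node's contents into item
    std::swap(item->d, temp->d);
    item->next = temp->next;       // bypass the next node
    delete temp;                   // temp now holds the data we wanted gone
}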
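As a usage sketch of the sentinel idea, the following assumes a hypothetical Item type that marks the sentinel with a flag rather than a special key value. It copies the next node into the current one instead of swapping, which has the same effect because the next node is freed immediately afterwards.

```cpp
// Hypothetical node type; the is_sentinel flag is an assumption of this sketch.
struct Item {
    int key;
    bool is_sentinel;   // true only for the terminating node
    Item* next;
};

// Prepend a new, non-sentinel node and return the new head.
Item* push_front(Item* head, int key) {
    return new Item{key, false, head};
}

// Delete item in O(1) without knowing its predecessor.
// Precondition: item is not the sentinel, so item->next exists.
void delete_item(Item* item) {
    Item* victim = item->next;
    *item = *victim;    // take over the next node's key, flag and link
    delete victim;      // any outside pointer to the old next node is now dangling
}
```

Note the side effect: when the node being deleted sits right before the sentinel, it becomes the new sentinel. More generally, the surviving data changes address, which matters if anything else (such as a hash table slot or an index) stores pointers to nodes.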

score 1 · Accepted Answer

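Suppose you want to delete an element x. With a doubly linked list, you can easily connect the element before x to the element after x. There is no need to traverse the whole list, so the deletion is O(1).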

score 1 · Accepted Answer

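In general you are correct. However, the algorithm you posted takes the element itself as input, not just its key:

Note that CHAINED-HASH-DELETE takes as input an element x and not its key k, so that we don't have to search for x first.

You already have the element x. Because the list is doubly linked, x carries pointers to both its predecessor and its successor, so you can fix up those neighbours in O(1). With a singly linked list only the successor is available, so you would have to search for the predecessor in O(n).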
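The following sketch puts this in the context of a chained hash table. It is an illustration under assumptions, not the textbook's code: the fixed table size, the toy hash function slot_of, and the integer keys are all made up for the example.

```cpp
#include <array>
#include <cstddef>

struct Node {
    int key;
    Node* prev;
    Node* next;
};

constexpr std::size_t kSlots = 8;            // assumed table size
using Table = std::array<Node*, kSlots>;     // each slot holds the head of a chain

// Toy hash function, for illustration only.
std::size_t slot_of(int key) {
    return static_cast<std::size_t>(key) % kSlots;
}

// Insert x at the head of its chain: O(1).
void chained_hash_insert(Table& t, Node* x) {
    Node*& head = t[slot_of(x->key)];
    x->prev = nullptr;
    x->next = head;
    if (head != nullptr) head->prev = x;
    head = x;
}

// Search by key: time proportional to the length of the chain.
Node* chained_hash_search(const Table& t, int key) {
    for (Node* p = t[slot_of(key)]; p != nullptr; p = p->next) {
        if (p->key == key) return p;
    }
    return nullptr;
}

// Delete by element, not by key: O(1), no search needed.
void chained_hash_delete(Table& t, Node* x) {
    if (x->prev != nullptr) {
        x->prev->next = x->next;
    } else {
        t[slot_of(x->key)] = x->next;        // x was the head of its chain
    }
    if (x->next != nullptr) x->next->prev = x->prev;
}
```

If the caller only has a key, it still pays for chained_hash_search first. The O(1) bound applies to the delete step alone.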

score 0 · Accepted Answer

Find(x) is, in general, O(1) for a chained hash table -- it is immaterial whether or not you use singly linked lists or doubly linked lists. They are identical in performance.

If, after having run Find(x), you decide that you'd like to delete the object returned, you will find that, internally, a hash table might have to search for your object again. It's still usually going to be O(1) and not a big deal, but if you find that you delete an awful lot, you can do a little better. Instead of returning a user's element directly, return a pointer to the underlying hash node. You can then take advantage of some internal structures. So if, in this case, you chose a doubly linked list as the way to express your chaining, then during the delete process there is no need to recompute the hash and search the collection again; you can omit this step. You have enough information to perform a delete right from where you are sitting. Additional care must be taken if the node you are submitting is the head node, so an integer might be used to mark the location of your node in the original array if it is the head of a linked list.

The trade-off is the guaranteed space taken by the extra pointer versus a possible faster delete (and slightly more complicated code). With modern desktops, space is usually very cheap, so this might be a reasonable trade-off.
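One way to sketch this handle idea in C++ is to let std::list (a doubly linked list) serve as the chain and hand back an iterator together with the slot index. The ChainedMap class and its Handle type below are hypothetical names for this illustration, not an existing library API.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

class ChainedMap {
public:
    using Chain = std::list<std::pair<std::string, int>>;

    // What find() returns: enough information to erase without searching again.
    struct Handle {
        std::size_t slot;       // index of the chain in the bucket array
        Chain::iterator pos;    // position of the node inside that chain
        bool found;
    };

    explicit ChainedMap(std::size_t slots) : buckets_(slots) {}

    void insert(const std::string& key, int value) {
        buckets_[slot_of(key)].emplace_front(key, value);
    }

    Handle find(const std::string& key) {
        std::size_t s = slot_of(key);
        for (auto it = buckets_[s].begin(); it != buckets_[s].end(); ++it) {
            if (it->first == key) return {s, it, true};
        }
        return {s, buckets_[s].end(), false};
    }

    // Precondition: h.found is true and h came from this map.
    // No hashing and no scan: std::list erases at an iterator in constant time.
    void erase(const Handle& h) {
        buckets_[h.slot].erase(h.pos);
    }

private:
    std::size_t slot_of(const std::string& key) const {
        return std::hash<std::string>{}(key) % buckets_.size();
    }

    std::vector<Chain> buckets_;
};
```

Storing the slot index in the handle plays the role of the integer mentioned above for locating the head of the chain.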

score 0 · Accepted Answer

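From a coding point of view: you can use unordered_map in C++ to implement this.

unordered_map<value,node*>mp;

Here node* is a pointer to a structure that stores the key and the left and right pointers.

How to use it:

If you have a value v and want to delete its node, just do the following:

Access the node for that value with mp[v].
Now make the node on its left point to the node on its right (and vice versa).

And voila, you are done.

(As a reminder, in C++ an unordered_map takes O(1) on average to access a specific stored value.)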
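A fuller sketch of this idea, under the assumption that values are distinct integers. The IndexedList wrapper and its member names are invented for the example. It uses find rather than mp[v], because operator[] would insert an empty entry for a value that is not present.

```cpp
#include <unordered_map>

struct Node {
    int value;
    Node* left;
    Node* right;
};

// Doubly linked list plus a value -> node index (assumes distinct values).
struct IndexedList {
    Node* head = nullptr;
    std::unordered_map<int, Node*> mp;

    ~IndexedList() {
        while (head != nullptr) {
            Node* rest = head->right;
            delete head;
            head = rest;
        }
    }

    void push_front(int v) {
        Node* n = new Node{v, nullptr, head};
        if (head != nullptr) head->left = n;
        head = n;
        mp[v] = n;
    }

    // Average O(1): one hash lookup, then a constant number of pointer updates.
    bool erase(int v) {
        auto it = mp.find(v);
        if (it == mp.end()) return false;
        Node* n = it->second;
        if (n->left != nullptr) n->left->right = n->right;
        else head = n->right;                 // n was the first node
        if (n->right != nullptr) n->right->left = n->left;
        mp.erase(it);
        delete n;
        return true;
    }
};
```

Both neighbours need updating, not just one, and the head pointer has to move when the first node is removed.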

score 0 · Accepted Answer

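While reading the textbook, I was confused by the same point (whether "x" is a pointer to an element or the element itself), and eventually landed on this question. After going through the discussion above and consulting the textbook again, I think the book implicitly assumes that "x" is a "node" whose attributes include "key" and "next".

A few lines from the textbook:

1) CHAINED-HASH-INSERT(T, x): insert x at the head of list T[h(x.key)]

2) If the lists were only singly linked, then to delete element x, we would first have to find x in the list T[h(x.key)] so that we could update the next attribute of x's predecessor.

So we can assume that a pointer to the element is given, and I think Fezvez has given a good explanation for the question asked.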

score -3 · Accepted Answer

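The textbook is wrong. The first member of a list has no usable "previous" pointer, so additional code is needed to find and unlink the element if it happens to be the first in its chain (typically 30% of the elements are the head of their chain if N = M, that is, when mapping N items into M slots with a separate chain per slot).

EDIT:

A better way than using a back link is to use a pointer to the link that points to us (typically the ->next link of the previous node in the list).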

struct node {
   struct node **pppar;
   struct node *nxt;
   ...
   };

The delete then becomes:

*(p->pppar) = p->nxt;

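A nice feature of this approach is that it works equally well for the first node in the chain (whose pppar pointer points to some pointer that is not part of a node).

UPDATE 2011-11-11

Because people fail to see my point, I'll try to illustrate. As an example, there is a hash table called table (basically an array of pointers) and a bunch of nodes one, two, three, one of which has to be deleted.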

    struct node *table[123];
    struct node *one, *two,*three;
    /* Initial situation: the chain {one,two,three}
    ** is located at slot#31 of the array */
    table[31] = one, one->next = two, two->next = three, three->next = NULL;
                one->prev = NULL, two->prev = one, three->prev = two;


    /* How to delete element one :*/
    if (one->prev == NULL) {
            table[31] = one->next;
            }
    else    {
            one->prev->next = one->next;
            }
    if (one->next) {
            one->next->prev = one->prev;
            }

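Now it is obvious that the above code is O(1), but there is something nasty: it still needs the array table and the index 31. In most cases a node is "self-contained", and a pointer to the node is enough to delete it from its chain, except when it happens to be the first node in its chain; then additional information is needed to find table and 31.

Next, consider the equivalent structure with a pointer-to-pointer as the back link.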

    struct node {
            struct node *next;
            struct node **ppp;
            char payload[43];
            };

    struct node *table[123];
    struct node *one, *two,*three;
    /* Initial situation: the chain {one,two,three}
    ** is located at slot#31 of the array */
    table[31] = one, one->next = two, two->next = three, three->next = NULL;
                one->ppp = &table[31], two->ppp = &one->next, three->ppp = &two->next;

    /* How to delete element one */
    *(one->ppp) = one->next;
    if (one->next) one->next->ppp = one->ppp;

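Note: there are no special cases, and no need to know the parent table. (Consider the case where there is more than one hash table with the same node type: in the {prev,next} version, the delete operation would still need to know which table the node should be removed from.)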
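The fragment above assumes the nodes already exist. A runnable variant might look like the sketch below, where the helper names insert_at_head and unlink_node are made up for the illustration and the payload is reduced to an int key.

```cpp
struct Node {
    Node* next;
    Node** ppp;     // address of the pointer that points at this node
    int key;
};

// Insert n at the head of the chain whose head pointer lives at *slot.
void insert_at_head(Node** slot, Node* n) {
    n->next = *slot;
    n->ppp = slot;
    if (n->next != nullptr) n->next->ppp = &n->next;
    *slot = n;
}

// Unlink n in O(1). The same two statements handle the head of the chain
// and interior nodes, and the table itself is never mentioned.
void unlink_node(Node* n) {
    *(n->ppp) = n->next;
    if (n->next != nullptr) n->next->ppp = n->ppp;
    n->next = nullptr;
    n->ppp = nullptr;
}
```

For the head of a chain, ppp holds the address of the table slot, so writing through it updates the table directly.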

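Often, in the {prev,next} scenario, the special cases are avoided by adding a dummy node at the start of the doubly linked list; but that node needs to be allocated and initialised, too.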
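For comparison, a minimal sketch of the dummy-node variant, assuming a hypothetical Chain type that embeds one dummy node per chain:

```cpp
struct Node {
    Node* prev;
    Node* next;
    int key;
};

// One chain with an embedded dummy node in front of the real nodes.
struct Chain {
    Node dummy{nullptr, nullptr, 0};    // never removed; its key is unused

    void push_front(Node* n) {
        n->prev = &dummy;
        n->next = dummy.next;
        if (n->next != nullptr) n->next->prev = n;
        dummy.next = n;
    }
};

// Every real node has a predecessor (at worst the dummy), so the
// "first node" special case disappears.
void remove_node(Node* n) {
    n->prev->next = n->next;
    if (n->next != nullptr) n->next->prev = n->prev;
    n->prev = nullptr;
    n->next = nullptr;
}
```

The cost is one extra node per chain, which for a hash table means one extra node per slot.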
