c++ - 如何为哈希表实现擦除功能？

Question

我有一个使用线性探测的哈希表。我被赋予了erase(int key)使用以下准则编写函数的任务。

 void erase(int key);

 Preconditions:  key >= 0
 Postconditions: If a record with the specified key exists in the table, then
 that record has been removed; otherwise the table is unchanged.

我也得到了一些完成任务的提示

重要的是要意识到插入函数将允许您向表中添加新条目，或更新表中的现有条目。
对于线性探测版本，请注意插入项目的代码有两个搜索。insert() 函数调用函数 findIndex() 来搜索表以查看该项目是否已经在表中。如果项目不在表中，则进行第二次搜索以找到表中的位置以插入项目。添加删除条目的功能将需要修改插入过程。搜索现有项目时，请确保搜索不会停止，因为该位置已被占用但由于该项目已被删除而现在为空。在搜索插入新项目的位置时，请使用第一个空位置 - 该位置是否曾被占用并不重要。

所以我开始写erase(key)，我似乎遇到了提示所指的问题，但我不确定这意味着什么。我将在一秒钟内提供代码，但是我为测试我的代码所做的是设置哈希表，以便它会发生冲突，然后我删除该键并重新哈希表，但它不会进入正确的位置。

例如，我在哈希表中添加了一些元素：

The hash table is:
Index  Key    Data
    0   31     3100
    1    1     100
    2    2     200
    3   -1
    4   -1
    5   -1
    6   -1
    7   -1
    8   -1
    9   -1
   10   -1
   11   -1
   12   -1
   13   -1
   14   -1
   15   -1
   16   -1
   17   -1
   18   -1
   19   -1
   20   -1
   21   -1
   22   -1
   23   -1
   24   -1
   25   -1
   26   -1
   27   -1
   28   -1
   29   -1
   30   -1

所以我所有的值都是空的，除了前 3 个索引。显然键 31 应该进入索引 1。但是由于键 1 已经存在，它会发生冲突并解决索引 0。然后我删除键 1 并重新哈希表，但键 31 保持在索引 0。

以下是可能值得一看的功能：

void Table::insert( const RecordType& entry )
{
   bool alreadyThere;
   int index;

   assert( entry.key >= 0 );

   findIndex( entry.key, alreadyThere, index );
   if( alreadyThere )
      table[index] = entry;   
   else
   {
      assert( size( ) < CAPACITY );
      index = hash( entry.key );
      while ( table[index].key != -1 )
         index = ( index + 1 ) % CAPACITY;
      table[index] = entry;
      used++;
   }
}

由于 insert 使用 findIndex，我也将其包括在内

void Table::findIndex( int key, bool& found, int& i ) const
{
   int count = 0;

   assert( key >=0 );

   i = hash( key );
   while ( count < CAPACITY && table[i].key != -1 && table[i].key != key )
   {
      count++;
      i = (i + 1) % CAPACITY;
   }   
   found = table[i].key == key;
}

这是我目前的擦除开始

void Table::erase(int key) 
{
    assert(key >= 0);

    bool found, rehashFound;
    int index, rehashIndex;

    //check if key is in table
    findIndex(key, found, index);

    //if key is found, remove it
    if(found)
    {
        //remove key at position
        table[index].key = -1;
        table[index].data = NULL;
        cout << "Found key and removed it" << endl;
        //reduce the number of used keys
        used--;
        //rehash the table

        for(int i = 0; i < CAPACITY; i++)
        {
            if(table[i].key != -1)
            {
                cout << "Rehashing key : " << table[i].key << endl;
                findIndex(table[i].key, rehashFound, rehashIndex);
                cout << "Rehashed to index : " << rehashIndex << endl;
                table[rehashIndex].key = table[i].key;
                table[rehashIndex].data = table[i].data;
            }
        }
    }
}

有人可以解释我需要做什么才能使其正确重新散列吗？我理解哈希表的概念，但我似乎在这里做错了什么。

编辑

根据用户的建议：

void Table::erase(int key)
{
    assert(key >= 0);
    bool found;
    int index;

    findIndex(key, found, index);

    if(found) 
    {
        table[index].key = -2;
        table[index].data = NULL;
        used--;

    }

}


//modify insert(const RecordType & entry)

while(table[index].key != -1 || table[index].key != -2)


//modify findIndex

while(count < CAPACITY && table[i].key != -1
      && table[i].key != -2 && table[i].key != key)

score 2 · Accepted Answer

从表中删除项目时，不要移动任何东西。只需在此处粘贴“已删除”标记即可。在插入时，将删除标记视为空且可用于新项目。在查找时，将它们视为已被占用，并在您击中一个时继续探测。调整表格大小时，忽略标记。

请注意，如果从不调整表格大小，这可能会导致问题。如果从不调整表的大小，一段时间后，您的表将没有标记为从未使用过的条目，并且查找性能将一落千丈。由于提示提到跟踪是否曾经使用过空位置并将曾经使用过的单元格与从未使用过的单元格区别对待，我相信这是预期的解决方案。据推测，调整表格大小将是以后的任务。

score 1 · Accepted Answer

每次删除完成时都不需要重新散列整个表。如果您想最大限度地减少性能下降，那么您可以通过考虑是否有任何元素在（允许从头到尾换行）被删除元素之后但在下一个 -1 散列到被删除元素处或之前的存储桶之前元素 - 如果是这样，那么它们可以移动到或至少靠近它们的哈希桶，然后您可以对刚刚移动的元素重复压缩过程。

进行这种压缩将消除您当前代码中的最大缺陷，即在少量使用后，每个存储桶将被标记为正在使用或已使用，并且例如查找不存在值的性能将降低到O（容量）。

在没有编译器/测试的情况下离开我的头顶......

int Table::next(int index) const
{
    return (index + 1) % CAPACITY;
}

int Table::distance(int from, int to) const
{
    return from < to ? to - from : to + CAPACITY - from;
}

void Table::erase(int key)
{
    assert(key >= 0);
    bool found;
    int index;

    findIndex(key, found, index);

    if (found) 
    {
        // compaction...
        int limit = CAPACITY - 1;
        for (int compact_from = next(index);
             limit-- && table[compact_from].key >= 0;
             compact_from = next(compact_from))
        {
            int ideal = hash(table[compact_from].key);
            if (distance(ideal, index) <
                distance(ideal, compact_from))
            {
                table[index] = table[compact_from];
                index = compact_from;
            }
        }

        // deletion
        table[index].key = -1;
        delete table[index].data; // or your = NULL if not a leak? ;-.
        --used;
    }
}

c++ - 如何为哈希表实现擦除功能？

2 回答 2

Related

Reference