c++ - 将静态哈希表转换为动态哈希表

Question

我被分配将具有静态内存分配的给定哈希表更改为动态，以便在程序运行时可以分配超过限制的更多内存。我绝不要求解决这个问题，我只是问是否有人知道一个好的起点或者我需要注意代码的哪些方面，因为我对哈希表有点迷茫和困惑。我知道枚举和构造函数需要更改，但我不确定其他很多。这是给出的代码，并提前感谢您的任何建议：

#ifndef TABLE1_H
#define TABLE1_H
#include <cstdlib>    // Provides size_t
#include <cassert>  // Provides assert

namespace main_savitch_12A
{
template <class RecordType>
class table
{
public:

enum { CAPACITY = 30 }; 
    // CONSTRUCTOR
    table( );
    // MODIFICATION MEMBER FUNCTIONS
    void insert(const RecordType& entry);
    void remove(int key);
    // CONSTANT MEMBER FUNCTIONS
    bool is_present(int key) const;
    void find(int key, bool& found, RecordType& result) const;
    size_t size( ) const { return used; }
private:
    // MEMBER CONSTANTS -- These are used in the key field of special records.
    enum { NEVER_USED = -1 };
    enum { PREVIOUSLY_USED = -2 };
    // MEMBER VARIABLES
    RecordType data[CAPACITY];
    size_t used;
    // HELPER FUNCTIONS
    size_t hash(int key) const;
    size_t next_index(size_t index) const;
    void find_index(int key, bool& found, size_t& index) const;
    bool never_used(size_t index) const;
    bool is_vacant(size_t index) const;
};

    template <class RecordType>
table<RecordType>::table( )
{
    size_t i;

    used = 0;
    for (i = 0; i < CAPACITY; ++i)
        data[i].key = NEVER_USED;  // Indicates a spot that's never been used.
}

template <class RecordType>
void table<RecordType>::insert(const RecordType& entry)
// Library facilities used: cassert
{
    bool already_present;   // True if entry.key is already in the table
    size_t index;        // data[index] is location for the new entry

    assert(entry.key >= 0);

    // Set index so that data[index] is the spot to place the new entry.
    find_index(entry.key, already_present, index);

    // If the key wasn't already there, then find the location for the new entry.
    if (!already_present)
    {
        assert(size( ) < CAPACITY);
        index = hash(entry.key);
        while (!is_vacant(index))
            index = next_index(index);
        ++used;
    }

    data[index] = entry;
    size_t i;
    for (i=0; i<CAPACITY; i++) cout << data[i].key << ' ';
    cout << endl;
}

template <class RecordType>
void table<RecordType>::remove(int key)
// Library facilities used: cassert
{
    bool found;        // True if key occurs somewhere in the table
    size_t index;   // Spot where data[index].key == key

    assert(key >= 0);

    find_index(key, found, index);
    if (found)
    {   // The key was found, so remove this record and reduce used by 1.
        data[index].key = PREVIOUSLY_USED; // Indicates a spot that's no longer in use.
        --used;
    }
}

template <class RecordType>
bool table<RecordType>::is_present(int key) const
// Library facilities used: assert.h
{
    bool found;
    size_t index;

    assert(key >= 0);

    find_index(key, found, index);
    return found;
}

template <class RecordType>
void table<RecordType>::find(int key, bool& found, RecordType& result) const
// Library facilities used: cassert.h
{
    size_t index;

    assert(key >= 0);

    find_index(key, found, index);
    if (found)
        result = data[index];
}

template <class RecordType>
inline size_t table<RecordType>::hash(int key) const
{
    return (key % CAPACITY);
}

template <class RecordType>
inline size_t table<RecordType>::next_index(size_t index) const
// Library facilities used: cstdlib
{
    return ((index+1) % CAPACITY);
}

template <class RecordType>
void table<RecordType>::find_index(int key, bool& found, size_t& i) const
// Library facilities used: cstdlib
{
size_t count; // Number of entries that have been examined

count = 0;
i = hash(key);
while((count < CAPACITY) && (data[i].key != NEVER_USED) && (data[i].key != key))
{
    ++count;
    i = next_index(i);
}
found = (data[i].key == key);
}

template <class RecordType>
inline bool table<RecordType>::never_used(size_t index) const
{
return (data[index].key == NEVER_USED);
}

template <class RecordType>
inline bool table<RecordType>::is_vacant(size_t index) const
{
return (data[index].key == NEVER_USED);// || (data[index].key == PREVIOUSLY_USED);
}
}

#endif

score 1 · Accepted Answer

需要思考的几点：

您可以使用vector而不是 C 样式的数组来保存元素，因为它允许动态调整大小。
当您需要扩大表时，您必须重新散列所有现有元素以将它们放在新容器中的新位置（然后在重新散列完成后，您可以与现有容器交换）。
您将希望能够指定决定何时增长的负载因子。
您需要决定是否要允许容器缩小分配的空间，再次在某个阈值处。

score 0 · Accepted Answer

@Mark B 的想法就是答案。

想补充：
建议你的表大小CAPACITY是素数。用质数修改弱散列函数hash(key)有助于分散。（一个好的散列函数不需要帮助。）

您的增长步骤通常是指数级的，可以构建到查找表中。不同的作者提出了 1.5 和 4 之间的比率。例如 Grow2x[] = { 0, 1, 3, 7, 13, 31, 61, ... (素数刚好在 2 的幂次以下};

假设您每次增长 2 倍，并且您的增长负载为 100%。那么你的收缩负荷应该是 70% 左右（100%/2x 和 100% 的几何平均值）。如果您的插入/删除徘徊在临界水平，您希望避免增长/缩小。

c++ - 将静态哈希表转换为动态哈希表

2 回答 2

Related

Reference