c# - 哈希表翻倍？

Question

我不知道标题是否有意义，但我想知道当您向其中添加项目时哈希表如何扩大？

是否像List<T>达到极限时它的大小翻倍？如果是这样，那么这种加倍是否会从头开始重新创建集合（这也可以回答List<T>，因为我不确定它是否这样做）？

最后，如果它确实从头开始重新创建它，那么这个特殊的 Add 操作对于不知道已达到限制的用户来说将是非常昂贵的，对吧？

score 5 · Accepted Answer

我相信两者Hashtable并Dictionary<TKey, TValue>在将当前计数增加一倍后扩展到下一个素数，例如从 31 到 67。

据我了解，调整大小不涉及重新计算哈希（因为它们与条目一起存储），而是涉及将每个条目放入其新存储桶中，其中存储桶编号基于哈希码和存储桶计数。

你问过List<T>- 这真的很简单。该列表由一个数组支持，您只需要创建一个具有正确大小的新数组，并复制当前数组的内容。就像是：

private void Resize(int newCapacity)
{
    T[] tmp = new T[newCapacity];
    Array.Copy(backingArray, tmp, backingArray.Length);
    backingArray = tmp;
}

score 1 · Accepted Answer

哈希表使用存储桶工作，每个存储桶可以容纳多个项目（至少在大多数实现中，有一些在已经使用的存储桶的情况下会重用其他存储桶）。桶的数量通常是一个素数，因此将哈希码除以桶的数量会返回可接受的“好”哈希分布。

通常，有一个特定的填充因子会触发更多桶的添加，从而触发哈希表的重建。由于哈希值除以桶数，因此需要根据它们的新桶索引重新分配实例，这基本上是从头开始重新创建。

对于 .NET 哈希表，您可以在某些构造函数中指定“加载因子” 。来自 MSDN：

负载因子是元素与桶的最大比率。较小的负载因子意味着以增加内存消耗为代价的更快查找。1.0 的负载因子是速度和大小之间的最佳平衡。

score 0 · Accepted Answer

大小并不总是翻倍，但会根据项目的数量而变化。

对于一个列表，这并不像重新创建一个字符串或数组那样昂贵，因为只需要将指针从一个列表复制到另一个列表，并且可以非常有效地完成。

对于哈希表/字典，必须重新分配项目，这可能非常昂贵。最好提前用估计的大小初始化哈希表。

score 0 · Accepted Answer

从 MSDN 页面上Hashtable.Add()：

如果 Count 小于 Hashtable 的容量，则此方法是 O(1) 操作。如果需要增加容量以容纳新元素，则此方法变为 O(n) 操作，其中 n 为 Count。

由于 List 具有相同的注释，我假设它们在内存分配方面的内部工作方式相似。

score 0 · Accepted Answer

如果有兴趣，为什么不深入研究反射器做一些研究：

private void Insert(object key, object nvalue, bool add)
{
    uint num;
    uint num2;
    if (key == null)
    {
        throw new ArgumentNullException("key", Environment.GetResourceString("ArgumentNull_Key"));
    }
    if (this.count >= this.loadsize)
    {
        this.expand();
    }
    else if ((this.occupancy > this.loadsize) && (this.count > 100))
    {
        this.rehash();
    }
    uint num3 = this.InitHash(key, this.buckets.Length, out num, out num2);
    int num4 = 0;
    int index = -1;
    int num6 = (int) (num % this.buckets.Length);
Label_0071:
    if (((index == -1) && (this.buckets[num6].key == this.buckets)) && (this.buckets[num6].hash_coll < 0))
    {
        index = num6;
    }
    if ((this.buckets[num6].key == null) || ((this.buckets[num6].key == this.buckets) && ((this.buckets[num6].hash_coll & 0x80000000L) == 0L)))
    {
        if (index != -1)
        {
            num6 = index;
        }
        Thread.BeginCriticalRegion();
        this.isWriterInProgress = true;
        this.buckets[num6].val = nvalue;
        this.buckets[num6].key = key;
        this.buckets[num6].hash_coll |= (int) num3;
        this.count++;
        this.UpdateVersion();
        this.isWriterInProgress = false;
        Thread.EndCriticalRegion();
    }
    else if (((this.buckets[num6].hash_coll & 0x7fffffff) == num3) && this.KeyEquals(this.buckets[num6].key, key))
    {
        if (add)
        {
            throw new ArgumentException(Environment.GetResourceString("Argument_AddingDuplicate__", new object[] { this.buckets[num6].key, key }));
        }
        Thread.BeginCriticalRegion();
        this.isWriterInProgress = true;
        this.buckets[num6].val = nvalue;
        this.UpdateVersion();
        this.isWriterInProgress = false;
        Thread.EndCriticalRegion();
    }
    else
    {
        if ((index == -1) && (this.buckets[num6].hash_coll >= 0))
        {
            this.buckets[num6].hash_coll |= -2147483648;
            this.occupancy++;
        }
        num6 = (int) ((num6 + num2) % ((ulong) this.buckets.Length));
        if (++num4 < this.buckets.Length)
        {
            goto Label_0071;
        }
        if (index == -1)
        {
            throw new InvalidOperationException(Environment.GetResourceString("InvalidOperation_HashInsertFailed"));
        }
        Thread.BeginCriticalRegion();
        this.isWriterInProgress = true;
        this.buckets[index].val = nvalue;
        this.buckets[index].key = key;
        this.buckets[index].hash_coll |= (int) num3;
        this.count++;
        this.UpdateVersion();
        this.isWriterInProgress = false;
        Thread.EndCriticalRegion();
    }
}

score 0 · Accepted Answer

当然，这完全取决于您的哈希实现。

一些散列加倍，一些将它们的大小更改为其他任意大小（例如，下一个素数）。

大多数哈希在更改其缓冲区大小后都需要重新哈希，这“只是”移动指针，但仍然与哈希大小成线性关系。但是，一些哈希使用一致的哈希，这减少了移动元素的需要（通常，只有一小部分元素需要移动）。

c# - 哈希表翻倍？

6 回答 6

Related

Reference