0

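I'm using a hash table (a .NET Dictionary object) as part of a sparse two-dimensional data set. Most of the entries in the hash table will be close together. I'll probably end up with 100 to 10,000 entries, all clustered around zero. I've read that a hash table performs better when the hashes are spread across the whole 32-bit integer range.

Is there a cheap way to map consecutive integers 1:1 onto completely different values? I don't have to map them back; this is purely one-way.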

4

3 Answers

3

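Maybe I'm misunderstanding what you're saying, but Dictionary already hashes your integers. There's no need to pre-hash them. Why not try the default implementation and see how it performs, rather than attempting a pre-optimization that is most likely pointless?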
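A minimal sketch of what "try the default implementation" could look like for the key range described in the question. The class and method names (`DefaultDictionaryCheck`, `Build`, `TimeLookups`) are made up for illustration. As far as I know, `Int32.GetHashCode()` simply returns the value itself and Dictionary then reduces that to a bucket index, so consecutive keys should already land in different buckets; the timing is there so you can check that on your own machine rather than take it on faith.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class DefaultDictionaryCheck
{
    // Builds a dictionary whose int keys cluster around zero, as in the question.
    public static Dictionary<int, double> Build(int count)
    {
        var map = new Dictionary<int, double>();
        for (int key = -count / 2; key < count - count / 2; key++)
        {
            map[key] = key * 0.5;
        }
        return map;
    }

    // Runs `rounds` passes of lookups over every key and returns elapsed milliseconds.
    public static long TimeLookups(Dictionary<int, double> map, int rounds)
    {
        var watch = Stopwatch.StartNew();
        double sum = 0;
        for (int r = 0; r < rounds; r++)
        {
            foreach (int key in map.Keys)
            {
                sum += map[key];
            }
        }
        watch.Stop();
        GC.KeepAlive(sum); // keep the loop from being optimized away
        return watch.ElapsedMilliseconds;
    }
}
```

Calling `TimeLookups(Build(10000), 1000)` and comparing it with the same loop over any pre-hashed variant should show whether the extra work buys anything.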

Answered 2009-09-19T06:03:51.207
1

If you know the maximum value of your key set (kmax), you could spread the keys out by a constant factor: multiply each key by a fixed prime number chosen so that the product stays below the maximum integer size (2^31 - 1):

i.e. the nearest prime number to (2^30) / kmax

Note: make sure the prime used is not the same as the number of buckets in the Hash table.
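The multiplier idea above could be sketched roughly as follows. All names here (`PrimeMultiplier`, `ChooseMultiplier`, `Spread`) are hypothetical, and I'm assuming kmax means the largest absolute key value, since the keys cluster around zero and may be negative. Multiplying by a nonzero constant without overflow keeps the mapping 1:1; `checked` is used so that an overflow fails loudly instead of silently wrapping.

```csharp
using System;

public static class PrimeMultiplier
{
    // Trial division; adequate for a one-off setup step at these sizes.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (int d = 3; (long)d * d <= n; d += 2)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Searches outward from target (below first, then above) for the closest prime.
    public static int NearestPrime(int target)
    {
        for (int offset = 0; ; offset++)
        {
            if (IsPrime(target - offset)) return target - offset;
            if (IsPrime(target + offset)) return target + offset;
        }
    }

    // kmax: largest absolute key value expected (assumed >= 1).
    public static int ChooseMultiplier(int kmax)
    {
        if (kmax < 1) throw new ArgumentOutOfRangeException("kmax");
        return NearestPrime((1 << 30) / kmax);
    }

    // Throws OverflowException if the key is larger than kmax allowed for.
    public static int Spread(int key, int multiplier)
    {
        return checked(key * multiplier);
    }
}
```

You would compute the multiplier once, e.g. `int m = PrimeMultiplier.ChooseMultiplier(10000);`, and then use `PrimeMultiplier.Spread(key, m)` as the dictionary key. Checking the multiplier against the table's bucket count, as the note says, is left out because Dictionary does not expose that number.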

Here is another solution: since the .NET Random class generates the same sequence for the same seed, you could seed it with each incoming key and use the first value it produces to distribute the keys.
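A rough sketch of the Random-based idea, with a hypothetical helper name (`RandomSpread.Spread`). Some caveats I'd flag: this is not guaranteed to be 1:1, `Next()` only returns non-negative values, and in the implementations I'm aware of the seed's absolute value is used, so `k` and `-k` would likely map to the same result. Constructing a Random per key is also far from cheap.

```csharp
using System;

public static class RandomSpread
{
    // Uses the key as the seed and returns the first value of that sequence.
    // Deterministic per key, but collisions between different keys are possible.
    public static int Spread(int key)
    {
        return new Random(key).Next();
    }
}
```

Given those caveats, this is probably only worth considering when keys are non-negative and an occasional collision is acceptable.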

Answered 2009-09-19T05:17:23.827
1

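Instead of using int directly as the key, write a small type that wraps an int and overrides GetHashCode (Int32 is a sealed value type, so you can't inherit from it). That way the only real work is writing this one function.

The easiest way I can think of to spread out the values evenly is to do something like:

public struct MyInteger : System.IEquatable<MyInteger>
{
    public readonly int Value;
    public MyInteger(int value) { Value = value; }

    // Dictionary needs Equals to agree with GetHashCode.
    public bool Equals(MyInteger other) { return Value == other.Value; }
    public override bool Equals(object obj) { return obj is MyInteger && Equals((MyInteger)obj); }

    public override int GetHashCode()
    {
        unchecked
        {
            // Math.Pow(Value, Value) overflows int for all but tiny values, so multiply
            // by an odd constant instead; that stays 1:1 modulo 2^32.
            return (int)((uint)Value * 2654435761u);
        }
    }
}

Consecutive values end up spread across the 32-bit range, while keeping the effort to a minimum.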
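A variation on the same idea that avoids a custom key type, not something this answer proposes: Dictionary accepts an `IEqualityComparer<int>` in its constructor, so the hash function can be swapped while the keys stay plain ints. The class names and the multiplier 2654435761 (a constant often quoted for multiplicative hashing) are my own choices for illustration; any odd multiplier keeps the mapping 1:1 modulo 2^32.

```csharp
using System;
using System.Collections.Generic;

public sealed class SpreadingIntComparer : IEqualityComparer<int>
{
    public bool Equals(int x, int y) { return x == y; }

    // Multiplication by an odd constant modulo 2^32 is a bijection,
    // so distinct keys always get distinct hash codes.
    public int GetHashCode(int key)
    {
        unchecked
        {
            return (int)((uint)key * 2654435761u);
        }
    }
}

public static class SpreadingIntComparerDemo
{
    // Builds a small dictionary that hashes its int keys through the comparer.
    public static Dictionary<int, string> Build()
    {
        var map = new Dictionary<int, string>(new SpreadingIntComparer());
        map[-1] = "left";
        map[0] = "origin";
        map[1] = "right";
        return map;
    }
}
```

Lookups look exactly as they would with the default comparer, e.g. `SpreadingIntComparerDemo.Build()[0]`, so it is easy to time both variants side by side before committing to either.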

Answered 2009-09-19T05:24:29.397