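I'm using a hash table (a .NET Dictionary object) as part of a sparse two-dimensional data set. Most of the entries in the hash table will be close together. I'll probably end up with 100 to 10,000 entries, all clustered around zero. I've read that a hash table performs better when the hashes are spread across the whole (32-bit) integer range.
Is there a cheap way to map consecutive integers onto wildly different values in a 1:1 fashion? I don't have to map them back; this is purely one-way.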
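Maybe I'm misunderstanding what you're saying, but Dictionary already hashes your integers. There's no need to pre-hash them. Why not try the default implementation and see how it performs, rather than attempting a pre-optimization that is very likely pointless?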
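Trying the default first only takes a few lines. Here is a rough sketch, assuming plain int keys clustered around zero; the `DefaultDictionaryCheck` class, the key range, and the stored values are made up for illustration. Timings depend on the machine, so none are shown.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: measures the stock Dictionary<int, double> on clustered keys.
public static class DefaultDictionaryCheck
{
    // Fills a dictionary with `count` consecutive keys centred on zero.
    public static Dictionary<int, double> Build(int count)
    {
        var cells = new Dictionary<int, double>();
        for (int key = -count / 2; key < count - count / 2; key++)
        {
            cells[key] = key * 0.5; // placeholder payload
        }
        return cells;
    }

    // Looks up every key once and returns the elapsed time.
    public static TimeSpan TimeLookups(Dictionary<int, double> cells, int count)
    {
        var watch = Stopwatch.StartNew();
        double sum = 0;
        for (int key = -count / 2; key < count - count / 2; key++)
        {
            sum += cells[key];
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Calling `DefaultDictionaryCheck.TimeLookups(DefaultDictionaryCheck.Build(10000), 10000)` gives a baseline to compare any pre-hashing scheme against.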
If you know the maximum value of your key set (kmax), you could spread the keys out by a constant factor (multiplier): multiply each key by a fixed prime number chosen so that the product stays below the maximum integer value (2^31 - 1):
i.e. the nearest prime number to (2^30) / kmax, which keeps the largest product near 2^30
Note: make sure the prime used is not the same as the number of buckets in the hash table.
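A minimal sketch of that idea, assuming the keys lie in [-kmax, kmax]. The `KeySpreader` class is a made-up name, and it picks the largest prime at or below 2^30 / kmax (rather than the strictly nearest one) so the product can never exceed 2^30.

```csharp
using System;

// Hypothetical helper: spreads keys by multiplying with a fixed prime.
public sealed class KeySpreader
{
    private readonly int _kmax;

    public int Multiplier { get; }

    public KeySpreader(int kmax)
    {
        if (kmax < 1) throw new ArgumentOutOfRangeException(nameof(kmax));
        _kmax = kmax;
        Multiplier = LargestPrimeAtMost((1 << 30) / kmax);
    }

    // Maps a key to key * Multiplier; distinct keys give distinct results.
    public int Spread(int key)
    {
        if (key < -_kmax || key > _kmax) throw new ArgumentOutOfRangeException(nameof(key));
        return key * Multiplier; // |key| * Multiplier <= 2^30, so no overflow
    }

    // Walks downwards from n until a prime is found.
    // Falls back to 1 (no spreading) when kmax is so large that no prime fits.
    private static int LargestPrimeAtMost(int n)
    {
        for (int candidate = n; candidate >= 2; candidate--)
        {
            if (IsPrime(candidate)) return candidate;
        }
        return 1;
    }

    // Simple trial division; n is at most 2^30, so at most ~32,768 steps.
    private static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (long i = 2; i * i <= n; i++)
        {
            if (n % i == 0) return false;
        }
        return true;
    }
}
```

With `var spreader = new KeySpreader(10000);`, neighbouring keys end up `spreader.Multiplier` apart. Following the note above, that multiplier should still be compared against the table's bucket count.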
Here is another solution: since the .NET Random class generates the same sequence for the same seed, you could seed it with each incoming key and use the first value it produces to distribute the keys.
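A sketch of the seeded approach, with some caveats worth stating up front. `SeededScrambler` is a hypothetical helper. Constructing a Random for every key is relatively expensive. The documentation for the `Random(int)` constructor says a negative seed is replaced by its absolute value, so `k` and `-k` collide. And nothing guarantees that two different seeds give different first values, so this is not strictly 1:1.

```csharp
using System;

// Hypothetical helper: uses the key as a seed and returns the first value drawn.
public static class SeededScrambler
{
    // The same key always yields the same non-negative result on a given runtime.
    public static int Scramble(int key)
    {
        return new Random(key).Next();
    }
}
```

For keys clustered around zero, the sign collision means negative keys would need to be offset into the non-negative range before being scrambled.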
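Instead of using int directly as the key, write a small type that wraps an int and overrides the GetHashCode function (int is a value type in .NET, so it cannot be inherited from). This way the only real work is writing this one function.
The easiest way I can think of to spread out the values evenly is to do something like:
    using System;

    public readonly struct MyInteger : IEquatable<MyInteger>
    {
        public readonly int Value;
        public MyInteger(int value) { Value = value; }

        public bool Equals(MyInteger other) => Value == other.Value;
        public override bool Equals(object obj) => obj is MyInteger other && Equals(other);

        public override int GetHashCode()
        {
            unchecked
            {
                // Math.Pow(Value, Value) overflows almost immediately, so this multiplies by an
                // odd constant instead (Knuth's multiplicative hash); modulo 2^32 that is 1:1.
                return (int)((uint)Value * 2654435761u);
            }
        }
    }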
Nice and evenly split up, while keeping the effort to a minimum.