我在字节数组上使用 murmur2 散列,但我只想散列字节的子集,murmur2 只允许我从 0 开始散列数组,但我想指定非 0 开始偏移量以及结束偏移量大批。
*
* @param data byte array to hash
* @param length length of the array to hash
* @param seed initial seed value
* @return 32 bit hash of the given array
*/
public static int hash32(final byte[] data, int length, int seed) {
// 'm' and 'r' are mixing constants generated offline.
// They're not really 'magic', they just happen to work well.
final int m = 0x5bd1e995;
final int r = 24;
// Initialize the hash to a random value
int h = seed^length;
int length4 = length/4;
for (int i=0; i<length4; i++) {
final int i4 = i*4;
int k = (data[i4+0]&0xff) +((data[i4+1]&0xff)<<8)
+((data[i4+2]&0xff)<<16) +((data[i4+3]&0xff)<<24);
k *= m;
k ^= k >>> r;
k *= m;
h *= m;
h ^= k;
}
// Handle the last few bytes of the input array
switch (length%4) {
case 3: h ^= (data[(length&~3) +2]&0xff) << 16;
case 2: h ^= (data[(length&~3) +1]&0xff) << 8;
case 1: h ^= (data[length&~3]&0xff);
h *= m;
}
h ^= h >>> 13;
h *= m;
h ^= h >>> 15;
return h;
}
我尝试了各种更改,但它总是导致我的哈希冲突测试从 0 变为非常高的数字。我不想使用 murmur3,因为它不适合像 murmur2 这样的单个小方法,在我的测试中,murmur2 也快一点。
这是我的碰撞测试仪,适用于任何想要破解它的人
HashSet<Integer> hs = new HashSet<>(100000000,(float) 1.0);
long collide = 0;
long totalLoops = 0;
byte[] ba = new byte[4];
long sTime = System.currentTimeMillis();
int hash;
for(byte d=0; d<5; d++) {
ba[0] = d;
for(byte i=-128; i<127; i++) {
ba[1] = i;
for(byte k=-128; k<127; k++) {
ba[2] = k;
for(byte j=-128; j<127; j++) {
ba[3] = j;
hash = hash32(ba,ba.length,0x9747b28c);
if(hs.contains(hash)) {
collide++;
} else {
hs.add(hash);
}
totalLoops++;
}
}
}
}
注意:上面的碰撞测试需要一台 8GB 内存的电脑。