
I'm using 32-bit FNV-1a hashing, but now I want to reserve one of the bits to hold useful information about the input key. That is, I want to use only 31 of the 32 bits for hash and 1 bit for something else.

Assuming FNV is well distributed for my application, is it safe to assume that dropping 1 bit will increase the collision rate by a factor of 32/31, as opposed to something dramatic?

The algorithm recommends XORing the discarded MSB with the LSB, but for 1 bit that seems pointless. As such, would it matter which bit is discarded (MSB or LSB)? And if not, would it matter whether the LSB/MSB were discarded after hashing each byte (i.e. using an even-numbered "prime") or after 32-bit hashing the entire byte array first?


1 Answer


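Dropping a single bit from a 32-bit hash code will have a larger impact than a 32/31 increase in the collision rate. To see why, note that there are 2^32 possible 32-bit hashes and 2^31 possible 31-bit hashes, which means that dropping one bit halves the number of possible hash values. That is a significant reduction, and it roughly doubles the probability that you will see a hash collision.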
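To make the factor of two concrete, here is a minimal sketch assuming an ideally uniform hash (which real FNV-1a only approximates). The function name `expected_colliding_pairs` is made up for illustration; it uses the standard birthday estimate n(n-1)/(2m) for n keys hashed into m possible values.

```python
def expected_colliding_pairs(n_keys: int, bits: int) -> float:
    """Expected number of colliding key pairs, assuming a uniform hash.

    Each of the n(n-1)/2 pairs collides with probability 1 / 2**bits.
    """
    buckets = 2 ** bits
    return n_keys * (n_keys - 1) / (2 * buckets)


if __name__ == "__main__":
    # Key counts are arbitrary examples, not taken from the question.
    for n in (1_000, 100_000, 1_000_000):
        e32 = expected_colliding_pairs(n, 32)
        e31 = expected_colliding_pairs(n, 31)
        print(f"n={n}: 32-bit={e32:.6g}, 31-bit={e31:.6g}, ratio={e31 / e32}")
```

Under this model the ratio between the 31-bit and 32-bit figures is 2 for any number of keys, not 32/31.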

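If you have few enough hashes that collisions are rare, dropping one bit is unlikely to change much. However, if collisions are already a problem, dropping a bit will roughly double your chances of seeing them.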
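For reference, the setup described in the question might look like the sketch below. It is one possible arrangement, not a recommendation from this answer: hash all the bytes with standard 32-bit FNV-1a, XOR-fold the top bit into the low 31 bits, then store a one-bit flag in the freed position. The names `fold31`, `pack_key`, and `flag` are hypothetical; the offset basis and prime are the usual 32-bit FNV constants.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME_32 = 0x01000193         # standard 32-bit FNV prime
MASK_31 = 0x7FFFFFFF


def fnv1a_32(data: bytes) -> int:
    """Plain 32-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # keep the result in 32 bits
    return h


def fold31(h: int) -> int:
    """XOR-fold a 32-bit hash to 31 bits (discarded MSB XORed into the LSB)."""
    return (h >> 31) ^ (h & MASK_31)


def pack_key(data: bytes, flag: bool) -> int:
    """Hypothetical layout: bit 31 carries the flag, bits 0-30 the hash."""
    return (int(flag) << 31) | fold31(fnv1a_32(data))


if __name__ == "__main__":
    packed = pack_key(b"foobar", True)
    print(hex(packed), "flag =", packed >> 31, "hash31 =", hex(packed & MASK_31))
```

Folding after the full 32-bit hash leaves the per-byte FNV-1a steps untouched, so the only change from the original hash is the final reduction to 31 bits.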

Answered 2022-01-15T16:31:44.103