c++ - 为布隆过滤器计算多个小整数散列的最佳方法是什么？

Question

我看过Murmur3和Meow，但在散列长数组时，它们似乎都针对带宽进行了优化。我没有任何数组，我只有uint32_t输入整数。我的输入是小的非负数，通常在几百万以下，都可以被 3 整除。对于某个整数 N，概率密度在 [0 .. N*3] 范围内是均匀的。

以下代码在性能和分发质量方面是否足够好？

// Count of bits in the complete Bloom filter
// Using 32kb or memory to hopefully stay in L1D cache
constexpr size_t countBits = 1024 * 32 * 8;

// These bytes are from random.org
// Cryptographic strength is irrelevant, we only need performance here.
static const __m128i seed = _mm_setr_epi8( (char)0x8a, (char)0xf4, 0x30, (char)0x91, (char)0x07, 0x05, 0x45, (char)0x99, 0x2f, (char)0x95, 0x4a, (char)0xa2, (char)0x84, (char)0x88, (char)0xe6, (char)0x09 );

// Mask to extract lower bits from the hash
static const __m128i andMask = _mm_set1_epi32( countBits - 1 );

// Compute 16 bytes of hash function, mask upper bits in every 32-bit lane, computing 4 bit positions for the Bloom filter.
inline __m128i hash( uint32_t key )
{
    // Broadcast integer to 32-bit lanes
    __m128i a = _mm_set1_epi32( (int)key );
    // Abuse AES-NI decrypting instruction to make a 16-byte hash.
    // That thing only takes 4 cycles of latency to complete.
    a = _mm_aesdec_si128( a, seed );
    // Mask upper bits in the lanes so the _mm_extract_epi32 returns position of bits in the Bloom filter to set or test
    a = _mm_and_si128( a, andMask );
    return a;
}

更新：这个问题不是基于意见的，因为散列冲突的计数是微不足道的。正是出于这个原因，我指定了输入值的分布。

score 1 · Accepted Answer

听起来您可能根本不需要使用哈希函数：

我的输入是小的非负数，通常在几百万以下，都可以被 3 整除。对于某个整数 N ，概率密度在 [0 .. N*3] 范围内是均匀的。

对布隆过滤器使用散列函数的全部意义在于，给定一个概率密度不均匀的输入，使其均匀。当然，即使统计上每个值出现的可能性相同，给定的输入仍然可能与先前的输入相关（例如，靠近的值可能出现在组中），在这种情况下，您仍然可能需要应用哈希函数。

以下代码在性能和分发质量方面是否足够好？

这很难回答，只有您才能判断它是否符合您的应用程序的要求。单轮 AES 不应该被认为是加密安全的，即使是这样，使用已知的固定种子也会允许攻击者控制布隆过滤器的输入，以使布隆过滤器非常安全的方式分配输入。效率低下。但是你自己说：

[...] 哈希冲突的计数是可以轻松测量的。

您可以在对实际输入应用 AES 解密回合之前和之后测量哈希值的分布，从而检查它是否适合您。

c++ - 为布隆过滤器计算多个小整数散列的最佳方法是什么？

1 回答 1

Related

Reference