这可以使用Rabin–Karp 算法为滑动窗口计算 a有效地解决rolling hash
,一个简单的滚动哈希函数是对字符的 ASCII 码求和,但是你可以使用这个数组primes
来减少冲突,我已经测试了这些素数并在大而相似的字符串和模式匹配上给了我一些冲突:
primes[] = {13 , 7963 , 17443 , 27527 , 37879 , 48673 , 59407 , 70729 , 81883 , 93251 , 104789 , 116531 , 128239 , 139969 , 151783 , 163883 , 176159 , 188029 , 200257 , 212447 , 224831 , 237283 , 249517 , 262217 , 274661 , 287173};
这是上述算法打印匹配数的伪代码:
stream = "aaabaabaabbab";
pattern = "aab";
queue window;
patternHash = 0;
for ch in pattern:
patternHash = patternHash + primes[ch - 'a']
first = readFromStream(stream)
window.enqueue(first)
windowHash = primes[first - 'a']
for i = 0 to pattern.size():
ch = readFromStream(stream)
window.enqueue(ch)
windowHash = windowHash + primes[ch - 'a']
count = 0
for i = pattern.size() to stream.size():
if windowHash == patternHash
if window == pattern
count = count + 1
ch = readFromStream(stream)
window.enqueue(ch)
windowHash = windowHash - primes[window.first() - 'a']
windowHash = windowHash + primes[ch - 'a']
window.dequeue()
print count