我正在实现 karp-rabin 子字符串匹配算法。当我在子字符串上调用 hash_string() 方法时,我的实现工作正常,但当我实现滚动哈希时失败。我的滚动哈希值不断增长,我不知道为什么。
def hash_string(string, base):
power = len(string) - 1
hash_value = 0
for i in range(power, -1, -1):
hash_value += (ord(string[i]) * (base ** power))
return hash_value
def karp_rabin(string, substring):
substrhash = hash_string(substring, 256)
rolling_hash_val = hash_string(string[0:len(substring)], 256)
for i in range(len(string) - len(substring) + 1):
if substrhash == rolling_hash_val and string[i:i+len(substring)] == substring:
return i
if i < len(string) - len(substring):
print rolling_hash_val
print (ord(string[i]) * (256 ** (len(substring) - 1))) * 256
rolling_hash_val = (rolling_hash_val - (ord(string[i]) * (256 ** (len(substring) - 1)))) * 256 + ord(string[i + len(substring)])
print karp_rabin('ababababaababab', 'aab')
更具体地说,问题出现在这里:
rolling_hash_val = (rolling_hash_val - (ord(string[i]) * (256 ** (len(substring) - 1)))) * 256 + ord(string[i + len(substring)])
即使子字符串长度保持不变,滚动哈希值也会增加几个数量级。这种滚动哈希实现是否正确?