python - 为什么 Python 中的这个 Rabin-Karp 代码这么慢？

Question

from time import process_time_ns
def initial_hash(desired):
    u=0
    for i in desired:
        u=(u*256+ord(i))%89
    return u
def find(Document):
    initial=Document[:len(desired)]
    hash_1=initial_hash(initial)
    for i in range(1,len(Document)-len(desired)+1):
        hash_1=skip(Document[i-1],hash_1)
        hash_1=append(Document[len(desired)+i-1],hash_1)
def append(char,u):
    u=(u*256+ord(char))%89
    return u
def skip(char,u):
    u=(u-(ord(char)*(constante)))%89
    return u
to=process_time_ns()
desired="ababa"
constante=(256**(len(desired)-1))%89
hash_desired=initial_hash(desired)
text=open("match.txt","r")
for x in text:
    find(x)
print(process_time_ns()-to)

我已经按照 MIT course 6006 实现了这段代码。我在上面的代码中使用了一些标记来查看时间。我将其更改match.txt为许多不同的大小以及desirable变量，这是我想在文档中找到的那个。但是对于很多变体，链接中描述的朴素算法总是更快。我选择 89 作为素数。代码基于此链接：

https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/lecture-videos/MIT6_006F11_lec09_orig.pdf

score 1 · Accepted Answer

Rabin--Karp 比朴素算法具有更好的最坏情况渐近运行时间。两个限定符都很重要：naive 有一个线性时间的最佳和平均情况，并且这种最佳情况下的常数比 Rabin--Karp 更好，因此在很多情况下它实际上会更快。如果您想看到很大的不同，请为天真选择一个错误的输入，例如搜索aaaa(many repetitions of a)aaaabin aaaa(many more repetitions of a)aaaab。

python - 为什么 Python 中的这个 Rabin-Karp 代码这么慢？

1 回答 1

Related

Reference