0

我需要一些关于我的 Karp-Rabin 算法的特定部分的帮助。我想要做的是用固定sliding window和单独appendskip部分来实现版本。Sliding window工作得很好。当我尝试将整体拆分为sliding window多个append部分时,就会出现问题skipAppend似乎工作正常,但这skip是过去几天让我头疼的事情。

问题- 我正在滑过包含几个订阅模式实例的字符串。Sliding window检测到它,但没有检测到其他两个。

这个想法是该结构保存(base ^ window size)mod prime number( )RH的预先计算值,因此我可以删除字符串的前导字符。这个值毕竟会随着窗口大小的变化而变化。为了减小 的值,乘法逆用于不处于模删除的情况(基模模值的逆)。它也是预先计算的。b2wmodappendskipb2wmod

以下是我感兴趣的代码部分。我不发布整个代码以免您阅读所有内容,但如果需要可以上传。乘法逆似乎计算正确,但我也可以上传代码。

将不胜感激任何帮助!先感谢您!

void
append_to_rh(RH rh)
{
    uint64_t hash    = rh->hash;
    uint64_t base    = rh->base;
    uint64_t mod     = rh->mod;
    uint64_t b2wmodm = rh->b2wmodm;
    char     new     = rh->new;
    
    hash             = ( hash * base + new ) % mod;    
    b2wmodm          = ( b2wmodm * base ) % mod;
    
    rh->hash         = hash;
    rh->b2wmodm      = b2wmodm;
}

void
skip(RH rh)
{
    uint64_t hash       = rh->hash;
    uint64_t base       = rh->base;
    uint64_t mod        = rh->mod;
    uint64_t b2wmodm    = rh->b2wmodm;
    uint64_t m_inv      = rh->m_inv;
    char     old        = rh->old;
    uint64_t correction = old * mod;
   
    b2wmodm = ( b2wmodm * (m_inv % mod) ) % mod;
    hash    = ( hash - old * b2wmodm + correction ) % mod;

    rh->hash    = hash;
    rh->b2wmodm = b2wmodm;
}

void
slide_window(RH rh)
{
    uint64_t base    = rh->base;
    uint64_t mod     = rh->mod;
    uint64_t hash    = rh->hash;
    uint64_t b2wmodm = rh->b2wmodm;
    char old = rh->old;
    char new = rh->new;

    hash     = ( hash * base - old * b2wmodm + new ) % mod;
    rh->hash = hash;
}
4

1 回答 1

1

您的appendskip功能工作正常。这是我用于测试的示例代码

#include <string>
#include <cassert>

typedef long long ll;

ll hash, b2wmodm, base, inv_base, mod;

// fast exponentiation, for calculating inv_base
ll exp(ll a, ll b){
    ll ans = 1;
    while (b){
        if (b&1){
            ans *= a;
            ans %= mod;
        }
        a *= a;
        a %= mod;
        b >>= 1;
    }
    return ans;
}

// calculates expected hash of the string
ll expected_hash(std::string s) {
    ll result = 0;
    ll multiplier = 1;
    for (int i = s.length()-1; i >= 0; i--) {
        result += s[i] * multiplier % mod;
        result %= mod;
        multiplier = multiplier * base % mod;
    }
    return result;
}

// same as your append and skip functions
void append_to_rh(ll newc) {
    hash = (hash * base + newc) % mod;
    b2wmodm = (b2wmodm * base) % mod;
}

void skip(ll old) {
    ll correction = old * mod;

    b2wmodm = (b2wmodm * (inv_base % mod)) % mod;
    hash = (hash - old * b2wmodm + correction) % mod;
}

int main() {

    base = 29;
    mod = 1000000007;

    hash = 0;
    b2wmodm = 1;
    inv_base = exp(base, mod-2);

    srand(time(nullptr));
    std::string s;
    for (int i = 0; i < 2000; i++) {
        if (i < 1000 || rand()%2) {
            char newchar = rand()%26 + 'a';
            s += newchar;
            append_to_rh(newchar);
            assert(expected_hash(s) == hash);
        } else {
            char oldchar = s[0];
            s = s.substr(1, s.length());
            skip(oldchar);
            assert(expected_hash(s) == hash);
        }
    }
}

我猜你代码的其他部分会导致问题。也许您正试图跳过一个空窗口,或者您可能使用非主要模块。

于 2021-04-16T02:59:34.030 回答