algorithm - KMP（在维基百科上）中的“部分匹配”表（又名“失败函数”）

Question

我正在阅读维基百科上的KMP 算法。“建表算法的伪代码描述”部分中有一行代码让我感到困惑：let cnd ← T[cnd]

它有一条评论：(second case: it doesn't, but we can fall back)，我知道我们可以后退，但为什么是 T[cnd]，有原因吗？因为它真的让我很困惑。

这是建表算法的完整伪代码：

algorithm kmp_table:
    input:
        an array of characters, W (the word to be analyzed)
        an array of integers, T (the table to be filled)
    output:
        nothing (but during operation, it populates the table)

    define variables:
        an integer, pos ← 2 (the current position we are computing in T)
        an integer, cnd ← 0 (the zero-based index in W of the next 
character of the current candidate substring)

    (the first few values are fixed but different from what the algorithm 
might suggest)
    let T[0] ← -1, T[1] ← 0

    while pos < length(W) do
        (first case: the substring continues)
        if W[pos - 1] = W[cnd] then
            let cnd ← cnd + 1, T[pos] ← cnd, pos ← pos + 1

        (second case: it doesn't, but we can fall back)
        else if cnd > 0 then
            let cnd ← T[cnd]

        (third case: we have run out of candidates.  Note cnd = 0)
        else
            let T[pos] ← 0, pos ← pos + 1

score 1 · Accepted Answer

您可以回退到，因为它包含模式WT[cnd]的前一个最长专有前缀的长度，它也是的专有后缀。因此，如果当前字符 at与字符 at 匹配，您可以延长最长正确前缀的长度（这是第一种情况）。W[0...cnd]W[pos-1]W[T[cnd]]W[0...pos-1]

我想这有点像动态编程，您依赖先前计算的值。

这个解释可能会对你有所帮助。

algorithm - KMP（在维基百科上）中的“部分匹配”表（又名“失败函数”）

1 回答 1

Related

Reference