string - Confused about the preprocessing table in KMP

Question

I am looking at the KMP algorithm and am confused by one particular line in the part that computes the suffix-prefix count table.

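algorithm kmp_table:
    input:
        an array of characters, W (the word to be analyzed)
        an array of integers, T (the table to be filled)
    output:
        nothing (but during operation, it populates the table)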

define variables:
    an integer, pos ← 2 (the current position we are computing in T)
    an integer, cnd ← 0 (the zero-based index in W of the next 
    character of the current candidate substring)

(the first few values are fixed but different from what the algorithm 
might suggest)
let T[0] ← -1, T[1] ← 0

while pos is less than the length of W, do:
    (first case: the substring continues)
    if W[pos - 1] = W[cnd], 
      let cnd ← cnd + 1, T[pos] ← cnd, pos ← pos + 1

    (second case: it doesn't, but we can fall back)
    otherwise, if cnd > 0, let cnd ← T[cnd]

    (third case: we have run out of candidates.  Note cnd = 0)
    otherwise, let T[pos] ← 0, pos ← pos + 1

The above is taken directly from Wikipedia. I am a bit confused about why, when cnd > 0, we set cnd := T[cnd]. Shouldn't cnd be reset to 0, as if we were starting over?
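For experimenting with the three cases, the pseudocode can be transcribed into Python roughly as follows. This is only a sketch: the name `kmp_table` mirrors the pseudocode, but returning the table instead of filling a caller-supplied array, and returning an empty list for an empty word, are assumptions that the Wikipedia text does not make.

```python
def kmp_table(W):
    """Build the table T for the word W, following the pseudocode above."""
    n = len(W)
    if n == 0:
        return []          # assumption: the pseudocode does not cover empty input
    T = [0] * n            # T[1] is already 0 when n >= 2
    T[0] = -1              # fixed first value
    pos, cnd = 2, 0
    while pos < n:
        if W[pos - 1] == W[cnd]:
            # first case: the substring continues
            cnd += 1
            T[pos] = cnd
            pos += 1
        elif cnd > 0:
            # second case: it doesn't, but we can fall back
            cnd = T[cnd]
        else:
            # third case: we have run out of candidates (cnd == 0)
            T[pos] = 0
            pos += 1
    return T


print(kmp_table("ABCDABD"))  # [-1, 0, 0, 0, 0, 1, 2]
```

The `elif cnd > 0` branch is the line in question: it changes `cnd` without advancing `pos`, so the same position is retried against a shorter candidate.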

score 0 · Accepted Answer

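Obviously, T[0] = -1, so setting cnd to T[cnd = 0] = -1 would amount to reading W[cnd = -1] on the next iteration, which is outside the string. For that reason alone, you need to treat cnd > 0 and cnd == 0 differently.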

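The real reason we compare cnd with 0 is that T[cnd] is meant to give the position in W[] to rewind to when the match fails at W[cnd]. T[0], however, cannot be used for this purpose, because there is nothing to the left of W[0].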

Why set cnd := T[cnd]? Shouldn't cnd be reset to 0, as if we were starting over?

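You are missing the whole point of the algorithm. If you restart from position 0 after a partial match, you are back to the naive algorithm. T[] contains rewind positions, and as example tables of W[] and T[] show, they are not always 0. So instead of going all the way back to position 0, you sometimes move to another position and continue matching from there. That is what makes the algorithm scale better than the naive one.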
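To make this concrete, here is a small Python sketch that builds the table twice, once with the fallback cnd := T[cnd] and once with cnd reset to 0. The function name `build_table`, the `reset_to_zero` flag, and the word "AABAAAB" are all made up for illustration and do not come from the Wikipedia article.

```python
def build_table(W, reset_to_zero=False):
    """Build the KMP table for W; optionally use the 'reset to 0' variant."""
    n = len(W)
    if n == 0:
        return []
    T = [0] * n
    T[0] = -1
    pos, cnd = 2, 0
    while pos < n:
        if W[pos - 1] == W[cnd]:
            cnd += 1
            T[pos] = cnd
            pos += 1
        elif cnd > 0:
            # the line in question: fall back to T[cnd], or (hypothetically) to 0
            cnd = 0 if reset_to_zero else T[cnd]
        else:
            T[pos] = 0
            pos += 1
    return T


print(build_table("AABAAAB"))                      # [-1, 0, 1, 0, 1, 2, 2]
print(build_table("AABAAAB", reset_to_zero=True))  # [-1, 0, 1, 0, 1, 2, 1]
```

The two tables differ in the last entry. The first six characters, "AABAAA", both start and end with "AA", so the correct value is 2. The mismatch at pos = 6 happens with cnd = 2, and T[2] = 1 records that one "A" of the candidate is still usable. Resetting to 0 throws that away and finds only a match of length 1.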
