3

我从 Wikipedia 检查了KMP 建表算法,但我不明白 while 循环的第二种情况背后的逻辑

(second case: it doesn't, but we can fall back)
        else if cnd > 0 then
            let cnd ← T[cnd]

我尝试使用此算法构建一个表,并且效果很好。我知道这cnd ← T[cnd]有助于找到合适的后缀长度。我不明白的是“如何”做到这一点?

一个例子的解释会很好。

谢谢!

编辑:我刚刚发现我的问题与此处的问题重复:KMP 中的“部分匹配”表(又名“失败函数”)(在维基百科上)

我想我现在得到了答案。不过,再做一个解释会很有帮助。谢谢!

4

1 回答 1

3

假设您有一个字符串Hello World!!!并且您想要搜索Head Up.

Hello World!!!
Head Up
  ^

当您在第一个和第二个字符中时,第一个条件适用(first case: the substring continues),在标记位置的情况下,该字符不匹配但您已经在一个子字符串匹配中(直到那里匹配到 2 个字符),这种情况对应到第二个条件(second case: it doesn't, but we can fall back)。第三种情况是您错过匹配模式的第一个字符。

第二个条件是必要的,因为您可以使用匹配字符的信息直到未匹配,以避免不必要的比较,您可能已经知道结果(跳过string您已经知道模式的开始部分不会的字符' t 匹配)。

示例:使用字符串HeHello World!!!并搜索Hello

HeHello World!!!
Hello
  ^ when you miss match this character using the table of KMP you known that 
    could skip 2 characters because

HeHello World!!!
 Hello
 ^ this would miss match

在为 pattern 构建模式表的情况下HeHello。假设^cnd*pos。起点是pos = 2and cnd = 0(但在检查模式时是 with pos - 1 = 1)。

HeHeHello     T [-1,0,0,0,0,0,0,0,0]
^* comparing 0 with 1 go to condition 3        cnd = 0, pos = 2
                        _
HeHeHello     T [-1,0,0,1,0,0,0,0,0]
^ * comparing 0 with 2 go to condition 1       cnd = 0, pos = 3
                          _
HeHeHello     T [-1,0,0,1,2,0,0,0,0]
 ^ * comparing 1 with 3 go to condition 1      cnd = 1, pos = 4
                            _
HeHeHello     T [-1,0,0,1,2,3,0,0,0]
  ^ * comparing 2 with 4 go to condition 1     cnd = 2, pos = 5
                              _
HeHeHello     T [-1,0,0,1,2,3,4,0,0]
   ^ * comparing 3 with 5 go to condition 1    cnd = 3, pos = 6

HeHeHello     T [-1,0,0,1,2,3,4,0,0]
    ^ * comparing 4 with 6 go to condition 2 (cnd = T[cnd], cnd = T[4] = 2)

HeHeHello     T [-1,0,0,1,2,3,4,0,0]
  ^   * comparing 2 with 6 go to condition 2 (cnd = T[cnd], cnd = T[2] = 0)

...
于 2014-09-10T16:07:56.127 回答