string - 确定性自动机在另一个字符串的字符串中查找子序列的数量

Question

确定性自动机查找字符串中的子序列数？如何构造 DFA 以查找出现字符串的数量作为另一个字符串中的子序列？

例如。在 "ssstttrrriiinnngggg" 中，我们有 3 个形成字符串 "string" 的子序列？

同样要找到和要搜索的字符串都只包含来自特定字符集的字符。我有一些关于将字符存储在堆栈中的想法，相应地弹出它们直到我们匹配，如果不匹配再次推送。请告诉 DFA 解决方案？

score 0 · Accepted Answer

重叠匹配

如果您想计算重叠序列的数量，那么您只需构建一个与字符串匹配的 DFA，例如

1 -(如果看到 s)-> 2 -(如果看到 t)-> 3 -(如果看到 r)-> 4 -(如果看到 i)-> 5 -(如果看到 n)-> 6 -(如果看到g)-> 7

然后使用动态规划计算看到每个字符后处于每个状态的方式数。有关更多详细信息，请参阅此问题的答案。

DP[a][b] = number of ways of being in state b after seeing the first a characters
         = DP[a-1][b] + DP[a-1][b-1] if character at position a is the one needed to take state b-1 to b
         = DP[a-1][b] otherwise

对于 b>1 和 DP[0][1]=1，从 DP[0][b]=0 开始。

那么重叠字符串的总数为 DP[len(string)][7]

非重叠匹配

如果您正在计算非重叠序列的数量，那么如果我们假设要匹配的模式中的字符是不同的，我们可以稍微修改一下：

DP[a][b] = number of strings being in state b after seeing the first a characters
         = DP[a-1][b] + 1 if character at position a is the one needed to take state b-1 to b and  DP[a-1][b-1]>0
         = DP[a-1][b] - 1 if character at position a is the one needed to take state b to b+1 and DP[a-1][b]>0
         = DP[a-1][b] otherwise

对于 b>1 和 DP[0][1]=infinity，从 DP[0][b]=0 开始。

那么非重叠字符串的总数为 DP[len(string)][7]

如果要匹配的模式包含重复的字符（例如“字符串”），这种方法不一定会给出正确的答案。

string - 确定性自动机在另一个字符串的字符串中查找子序列的数量

1 回答 1

重叠匹配

非重叠匹配

Related

Reference