4

I'm sure you've all heard of the "Word game", where you try to change one word to another by changing one letter at a time, and only going through valid English words. I'm trying to implement an A* Algorithm to solve it (just to flesh out my understanding of A*) and one of the things that is needed is a minimum-distance heuristic.

That is, the minimum number of one of these three mutations that can turn an arbitrary string a into another string b: 1) Change one letter for another 2) Add one letter at a spot before or after any letter 3) Remove any letter

Examples

aabca => abaca:
aabca
abca
abaca
= 2

abcdebf => bgabf:
abcdebf
bcdebf
bcdbf
bgdbf
bgabf
= 4

I've tried many algorithms out; I can't seem to find one that gives the actual answer every time. In fact, sometimes I'm not sure if even my human reasoning is finding the best answer.

Does anyone know any algorithm for such purpose? Or maybe can help me find one?

(Just to clarify, I'm asking for an algorithm that can turn any arbitrary string to any other, disregarding their English validity-ness.)

4

3 回答 3

6

您想要最小编辑距离(或 Levenshtein 距离)

两个字符串之间的 Levenshtein 距离定义为将一个字符串转换为另一个字符串所需的最小编辑次数,允许的编辑操作是插入、删除或替换单个字符。它以弗拉基米尔·列文斯坦 (Vladimir Levenshtein) 的名字命名,他在 1965 年考虑到了这个距离。

确定编辑顺序的一种算法在同一页面

于 2010-05-13T23:40:33.557 回答
2

S. Dasgupta、CH Papadimitriou 和 UV Vazirani 编写的算法教科书的第 6.3 节是关于“编辑距离”的绝佳参考,其草稿可在此处免费获得。

于 2010-05-15T22:34:03.450 回答
1

如果您有一个合理大小(小)的字典,那么广度优先树搜索可能会起作用。

所以从你的单词可以变异成的所有单词开始,然后所有可以变异成的单词(除了原来的),然后再到第三级……直到找到你要找的单词。

您可以消除发散的单词(离目标较远的单词),但这样做可能会导致您在必须通过某种发散状态才能到达最短路径的情况下失败。

于 2010-05-13T23:40:29.050 回答