c# - How can I make my program guess the correct word?

Question

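I'm interested in doing some AI/algorithm exploration. My idea is to build a simple application, a bit like hangman, where I pick a word and leave some of its letters visible as clues. But instead of having a user guess the word, I want my application to try to work it out from the clues I leave. Does anyone know where I should start? Thanks.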

score 3 · Accepted Answer

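Create a database of words for the language you want (for example, by indexing a Wikipedia dump).
That probably shouldn't come to more than 1 million words.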

Then you can simply query the database:

For example: fxxulxxs

--> SELECT * FROM T_Words WHERE word LIKE 'f__ul__s'

--> fabulous

If the result set contains more than one word, return the one that is statistically the most frequently used.
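A minimal sketch of the lookup above, written in Python with the standard `sqlite3` module to keep it short (the same query works from C# against any SQL database). The `freq` column, the `best_matches` helper, the use of "x" as the unknown-letter marker, and the toy rows are all assumptions for illustration; "famulous" is a made-up filler entry so that the frequency ordering has something to sort.

```python
import sqlite3

def best_matches(conn, pattern, wildcard="x"):
    """Return the words matching the clue, most frequent first."""
    # Turn the clue into a LIKE pattern: each unknown letter becomes "_".
    # Caveat: with "x" as the marker, clues containing a real "x" need a
    # different marker character.
    like = pattern.replace(wildcard, "_")
    rows = conn.execute(
        "SELECT word FROM T_Words WHERE word LIKE ? ORDER BY freq DESC",
        (like,),
    )
    return [word for (word,) in rows]

# Toy in-memory database; the frequencies are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T_Words (word TEXT PRIMARY KEY, freq INTEGER)")
conn.executemany(
    "INSERT INTO T_Words VALUES (?, ?)",
    [("fabulous", 120), ("famulous", 1), ("feculent", 2), ("fearless", 80)],
)

print(best_matches(conn, "fxxulxxs"))  # ['fabulous', 'famulous']
```

The first element of the returned list is the guess; the rest are fallbacks if that guess turns out to be wrong.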

Another option is to take a look at NHunspell.

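If you want to go deeper into analysis, you need a statistical method for correlating word stems, endings, and beginnings; in other words, a way to measure word similarity.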

Linguistic research suggests that words are easy to read when you have only the beginning and the end. With only the middle, it gets very hard.

score 2 · Accepted Answer

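You might want to look at an algorithm that measures edit distance, such as the Damerau-Levenshtein distance (wikipedia). It is commonly used to pick, from a set of words, the one that best matches some other given word.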

It is often used for searching and comparing DNA and protein sequences, but it may be useful in your case as well.
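For illustration, here is a short Python sketch of the optimal string alignment distance, which is the restricted variant of Damerau-Levenshtein (each substring may be edited at most once), not the unrestricted algorithm. The function names `osa_distance` and `closest_word` are hypothetical helpers, not part of any library.

```python
def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    # d[i][j] = number of edits needed to turn a[:i] into b[:j]
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i  # delete all i characters
    for j in range(len(b) + 1):
        d[0][j] = j  # insert all j characters
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Two adjacent characters swapped counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]

def closest_word(guess, candidates):
    """Return the candidate with the smallest distance to the guess."""
    return min(candidates, key=lambda w: osa_distance(guess, w))

print(closest_word("fabluous", ["fabulous", "famous", "fearless"]))  # fabulous
```

Note that an edit distance compares two complete strings, so it fits best when the program already has a rough guess and wants the nearest dictionary word, rather than for matching a pattern with blanks.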

score 0 · Accepted Answer

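The first step is to build a data structure that holds all the valid words and can easily be queried for every word matching the current pattern. From that list of matching words, you can then compute the most frequent letter and use it as the next candidate. Another approach would be to pick the letter that leaves the smallest set of matching words for the next round.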

next_guess(pattern, played_chars, dictionary)
  // find all the words that match the pattern and contain none of the
  // played letters that are missing from the pattern
  words_set = find_words_matching(pattern, played_chars, dictionary)

  // for each letter, its frequency in the word set
  letter_freq = build_frequency_array(words_set)

  // for each letter, the size of the word set that would remain if the
  // letter appears at least once in the word (I call this its power)
  letter_power = build_power_array(words_set)

  // pick the letter that minimizes a weighted function of the two arrays
  // above; choosing that function is the AI part
  candidate = minimize(weighted_function, letter_freq, letter_power)
  return candidate
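As a rough runnable take on the pseudocode above, here is a Python sketch that implements only the frequency heuristic: it guesses the untried letter that appears in the most candidate words. It does not implement the "power" array or a weighted function. It assumes hangman rules (every occurrence of a guessed letter is revealed, so a blank can never hide an already revealed letter) and uses "_" as the blank marker; both are assumptions, as is the tie-breaking rule.

```python
from collections import Counter

def find_words_matching(pattern, played_chars, dictionary, blank="_"):
    """Keep the words consistent with the pattern and the letters played."""
    revealed = set(pattern) - {blank}
    # Letters played but absent from the pattern are known misses.
    misses = set(played_chars) - revealed

    def fits(word):
        if len(word) != len(pattern) or misses & set(word):
            return False
        for p, c in zip(pattern, word):
            if p == blank:
                # Assumed hangman rule: a blank cannot hide a revealed letter.
                if c in revealed:
                    return False
            elif p != c:
                return False
        return True

    return [w for w in dictionary if fits(w)]

def next_guess(pattern, played_chars, dictionary, blank="_"):
    """Return the untried letter found in the most candidates, or None."""
    words = find_words_matching(pattern, played_chars, dictionary, blank)
    tried = set(played_chars) | (set(pattern) - {blank})
    counts = Counter()
    for w in words:
        counts.update(set(w) - tried)  # count each letter once per word
    if not counts:
        return None
    # Highest count wins; ties are broken alphabetically.
    return min(counts, key=lambda ch: (-counts[ch], ch))

words = ["bat", "cat", "cot", "dog"]
print(next_guess("_a_", "ae", words))  # t
```

Swapping the last line of `next_guess` for a different scoring function is where the other strategy from the answer (choosing the letter that shrinks the candidate set the most) would plug in.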
