ruby - Script to find the words contained in a given word, using a word list

Question

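I have a dictionary of 250K words (a txt file). For each of those words, I would like to come up with a script that outputs all of its possible anagrams (each anagram must also be in the dictionary).

Ideally, the script would output in this format:

word1: anagram1, anagram2...

word2: anagram1, anagram2...

Any help would be greatly appreciated.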

score 1 · Accepted Answer

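It must be anagram week.

I'm going to refer you to an answer I submitted to a prior question: https://stackoverflow.com/a/12811405/128421. It shows how to build a hash for quickly searching for words that share the same letters.

For your purpose, finding substrings/inner words, you'll also need to generate the possible inner words. Here's how to quickly find the unique letter combinations of every size from 3 letters up, based on a starting word: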

word = 'misses'
word_letters = word.downcase.split('').sort
3.upto(word.length) { |i| puts word_letters.combination(i).map(&:join).uniq }

eim
eis
ems
ess
ims
iss
mss
sss
eims
eiss
emss
esss
imss
isss
msss
eimss
eisss
emsss
imsss
eimsss

Once you have those combinations, split them (or skip the join) and look them up in the hash that my previous answer builds.
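A minimal sketch of that lookup step, under some assumptions: the word list is a small in-memory array, the hash is keyed by each word's sorted letters joined into a string, and the names `words`, `index` and `subwords` are illustrative only (they are not taken from the linked answer).

```ruby
# Hypothetical word list; in practice this would be read from the dictionary file.
words = %w[miss mess seism semis sis ems misses]

# Index: sorted letters => all words made of exactly those letters.
index = Hash.new { |hash, key| hash[key] = [] }
words.each { |w| index[w.downcase.chars.sort.join] << w }

# Returns every indexed word that can be built from the letters of `word`.
def subwords(word, index, min_length = 3)
  letters = word.downcase.chars.sort
  min_length.upto(letters.length).flat_map do |size|
    # Combinations of a sorted array are themselves sorted, so each joined
    # combination can be used directly as an index key.
    letters.combination(size).map(&:join).uniq.flat_map { |key| index.fetch(key, []) }
  end
end

puts subwords('misses', index).join(', ')
```

`fetch` with a default is used for the lookup so that missing keys do not get added to the index. Note that the starting word itself is part of the result when it is in the word list.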

score 1 · Accepted Answer

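Inspired by this, I'd suggest you create a Trie.

A trie with N levels would then hold all the possible anagrams (where N is the length of the original word). To get the words of different sizes, I'd suggest simply traversing the trie, i.e. for all 3-letter sub-words, just build every string that is 3 levels deep in the trie.

I'm not entirely sure about this because I haven't tested it, but it's an interesting challenge, and this suggestion is how I would start tackling it.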
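One possible reading of this idea, sketched under assumptions: instead of a trie of permutations, the sketch below builds a trie from the dictionary words and walks it using only the letters available in the starting word. The `TrieNode` class and the `build_trie` and `words_from_letters` helpers are hypothetical names for illustration, and the word list is made up.

```ruby
# A trie node: children keyed by letter, plus a flag marking the end of a word.
class TrieNode
  attr_reader :children
  attr_accessor :terminal

  def initialize
    @children = {}
    @terminal = false
  end
end

# Inserts every word of the (hypothetical) word list into a trie.
def build_trie(words)
  root = TrieNode.new
  words.each do |word|
    node = word.downcase.each_char.reduce(root) { |n, ch| n.children[ch] ||= TrieNode.new }
    node.terminal = true
  end
  root
end

# Walks the trie, spending letters from `counts`, and collects every
# complete word that is at least min_length letters long.
def words_from_letters(node, counts, prefix = '', min_length = 3, found = [])
  found << prefix if node.terminal && prefix.length >= min_length
  node.children.each do |ch, child|
    next unless counts[ch] > 0
    counts[ch] -= 1
    words_from_letters(child, counts, prefix + ch, min_length, found)
    counts[ch] += 1 # give the letter back before trying the next branch
  end
  found
end

trie = build_trie(%w[miss mess seism semis sis ems misses mister])
letter_counts = 'misses'.chars.each_with_object(Hash.new(0)) { |ch, h| h[ch] += 1 }
puts words_from_letters(trie, letter_counts).sort.join(', ')
```

The depth of a node equals the length of the word it spells, so restricting the walk to a given depth yields the sub-words of that size.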

Hope this helps =)

score 0 · Accepted Answer

What I have tried so far, in Perl:

use strict;
use warnings;

use Algorithm::Combinatorics qw(permutations);

die "First argument should be a dict\n" unless $ARGV[0];
open my $fh, "<", $ARGV[0] or die $!;

my @arr = <$fh>;
my $h = {};

map { chomp; $h->{lc($_)} = [] } @arr;

foreach my $word (@arr) {
    $word = lc($word);
    my $chars = [ ( $word =~ m/./g ) ];
    my $it = permutations($chars);

    while ( my $p = $it->next ) {
        my $str = join "", @$p;

        if ($str ne $word && exists $h->{$str}) { 
            push @{ $h->{$word} }, $str
                unless grep { $_ eq $str } @{ $h->{$word} };
        }
    }

    if (@{ $h->{$word} }) {
        print "$word\n";
        print "\t$_\n" for @{ $h->{$word} };
    }
}

END{ close $fh; }

There is probably room for some speed improvements, but it works.

I used the French dictionary from the Arch Linux words package.

Example

$ perl annagrammes.pl /usr/share/dict/french
abaissent
        absentais
        abstenais
abaisser
        baissera
        baserais
        rabaisse
(...)

Note: to install the Perl module:

cpan -i Algorithm::Combinatorics

score 0 · Accepted Answer


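# Group words by their sorted letters; the block must store the new array in the hash
h = Hash.new { |hash, key| hash[key] = [] }
array_of_words.each { |w| h[w.downcase.chars.sort].push(w) }
h.values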
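A small sketch of how this grouping could be turned into the output format asked for in the question. The sample `array_of_words` and the `anagram_lines` helper are made up for illustration; a real run would read the 250K-word file instead.

```ruby
# Hypothetical word list; a real run would read the dictionary file line by line.
array_of_words = %w[listen silent enlist tinsel google banana]

# Group words by their sorted-letter signature.
groups = Hash.new { |hash, key| hash[key] = [] }
array_of_words.each { |w| groups[w.downcase.chars.sort.join] << w }

# Builds one "word: anagram1, anagram2" line per word that has at least one anagram.
def anagram_lines(words, groups)
  words.map do |w|
    others = groups[w.downcase.chars.sort.join] - [w]
    "#{w}: #{others.join(', ')}" unless others.empty?
  end.compact
end

puts anagram_lines(array_of_words, groups)
```

Each word is sorted once and looked up once, so the work grows roughly linearly with the size of the dictionary rather than with the number of permutations of each word.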

answered 2012-10-11T01:06:46.373
