
This is a confusing problem, so the wording may be hard to follow. I'm likely overcomplicating something simple, so I've added an example to help illustrate it.

Example Question

Finding a 5-letter word in a database from the characters hxinenvarbav. I've organized the words in the DB so that each row also has a column holding the word's letters in alphabetical order. For example, the word "happy" has a column with the value "ahppy". I can put the letters hxinenvarbav into alphabetical order with the following code.

<?php
    $letters = str_split('hxinenvarbav');
    sort($letters); // alphabetical order; sort() reindexes, unlike asort()
    $letters = implode('', $letters); // 'aabehinnrvvx'
?>

The Issue

However, I can't simply search in MySQL with "LIKE '%aabehinnrvvx%'" to find 5-letter words made from those characters, as that obviously won't return any results. Maybe there is a MySQL query that could do this, or maybe I should organize the column differently? I can, however, use str_split($letters, 5) to take 5-letter chunks of the 12-letter string.

How would I go about splitting out each possible 5-letter combination from these 12 letters, keeping in mind that I need to query the table?

Does this make sense? Do I need to elaborate further? I'm likely overthinking this and can't seem to simplify what I'm trying to accomplish. I have some complex mathematics that can find all possible combinations. Since I've placed the letters in alphabetical order, I only need to search combinations, not permutations. On top of that, as far as I can tell, I shouldn't need to query each combination: there are 792 possible 5-letter combinations from just 12 letters (before accounting for repeated characters). 792 query calls is not nice, and 792 OR conditions in one query is clearly not an option. LOL!

Any suggestions?

I did also think about searching by the letters of the alphabet that are not available, but some words have repeating letters, so that's not an option either.


2 Answers


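If you have a table named "dict" with the fields "word" and "combo", where "combo" holds the alphabetized characters of each "word", then you can index "combo".

You build your set of combinations in memory programmatically, then use it to construct a single SELECT statement with an IN clause listing every combination, e.g. "SELECT * FROM dict WHERE combo IN ('combo1', 'combo2', ..., 'comboN');".

This should be very fast and easy to implement.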
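
As a rough sketch of this approach (not necessarily how the answerer would write it), the PHP below enumerates the distinct 5-letter combinations and builds a parameterized IN query. The helper name sortedCombos is hypothetical, the table and column names follow the "dict"/"combo" naming above, and the PDO connection $pdo in the comments is assumed to exist.

```php
<?php
// Hypothetical helper: every distinct $k-letter combination of $letters,
// each returned as an alphabetically sorted string (matching `combo`).
function sortedCombos(string $letters, int $k): array
{
    $chars = str_split($letters);
    sort($chars); // alphabetical order, so equal letters sit next to each other
    $found = [];

    $pick = function (int $start, string $prefix) use (&$pick, &$found, $chars, $k) {
        if (strlen($prefix) === $k) {
            $found[] = $prefix;
            return;
        }
        for ($i = $start; $i < count($chars); $i++) {
            // Skip a repeated letter at the same depth to avoid duplicate combos.
            if ($i > $start && $chars[$i] === $chars[$i - 1]) {
                continue;
            }
            $pick($i + 1, $prefix . $chars[$i]);
        }
    };
    $pick(0, '');

    return $found;
}

$combos = sortedCombos('hxinenvarbav', 5);

// One "?" placeholder per combination, so the values are bound, not inlined.
$placeholders = implode(',', array_fill(0, count($combos), '?'));
$sql = "SELECT word FROM dict WHERE combo IN ($placeholders)";

// Assuming an existing PDO connection in $pdo:
// $stmt = $pdo->prepare($sql);
// $stmt->execute($combos);
// $words = $stmt->fetchAll(PDO::FETCH_COLUMN);
```

For the example letters this yields 315 distinct combinations rather than 792, because the repeated a, n, and v collapse many selections into the same string.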

answered 2013-02-25T03:31:45.187

For performance reasons, you probably won't do this with a single SQL statement alone, but with a combination of SQL and post-query filtering.

select * from A where word_len = 5 and (
 substring( word_in_db, 1, 1) IN ('h', 'x', 'i', 'n', 'e', 'n', 'v', 'a', 'r', 'b', 'a', 'v')
 and
 substring( word_in_db, 2, 1) IN ('h', 'x', 'i', 'n', 'e', 'n', 'v', 'a', 'r', 'b', 'a', 'v')
 -- etc...
)

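This limits the number of clauses to the number of letters in the target word.

This will not catch repeated-letter mismatches, such as a word with 2 E's when the input letters contain only 1 E; those are what the post-query filter has to remove. You may want to compute the word length and save it as a derived value for speed (and index the column, of course).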
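
The post-query filter could look something like the sketch below, which compares letter counts so that a word needing two E's is rejected when only one is available. The function name canSpell and the sample $candidates list are made up for illustration; in practice the candidates would be the rows returned by the query above.

```php
<?php
// Hypothetical helper: true if $word can be spelled from $letters,
// using each available letter no more often than it appears.
function canSpell(string $word, string $letters): bool
{
    $available = count_chars($letters, 1); // byte value => occurrences
    foreach (count_chars($word, 1) as $byte => $needed) {
        if (($available[$byte] ?? 0) < $needed) {
            return false;
        }
    }
    return true;
}

// Stand-in for rows returned by the SQL pre-filter (illustrative only).
$candidates = ['brain', 'raven', 'never', 'haven'];

$matches = array_values(array_filter(
    $candidates,
    fn (string $w): bool => canSpell($w, 'hxinenvarbav')
));
// 'never' passes the per-position SQL test but is dropped here,
// because it needs two E's and the input has only one.
```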

answered 2013-02-26T14:33:14.347