mysql - How do I design a MySQL query that searches a table of phrases for a set of words?

Question

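I have a table with about 100,000 rows.

Each row contains a sentence, a sentence fragment, or a phrase.

I would like to write a query that finds every row containing all of a given set of words, even when the words in the criteria appear in a different order than in the sentence.

For example, if my table looks like this: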

id sentence
-- ---------------------------------------------------------------------------
 1 How now brown cow
 2 Alas, poor Yorick! I knew him
 3 Call me Ishmael
 4 A screaming comes across the sky
 5 It was a bright cold day in April, and the clocks were striking thirteen
 6 It was the best of times, it was the worst of times
 7 You don't know about me without you have read a book
 8 In the late summer of that year we lived in a house in a village
 9 One summer afternoon Mrs. Oedipa Maas came home from a Tupperware party
10 It was a queer, sultry summer, the summer they electrocuted the Rosenbergs

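My query criteria would be one or more words, in no particular order.

The result set should contain every sentence that contains all of the words.

For example, if the criteria are "the was", the results should include rows 5, 6, and 10.

Ideally, I would like to refine this so that the criteria only need to contain the beginnings of words. (Note that I want to let users type only the beginning of a word, not just its middle or end.)

For example, if the criteria are "elect sul", the results would include row 10.

Currently, this is what I am doing: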

SELECT
    id, sentence
FROM
    sentence  -- table name assumed here; the original query omitted the FROM clause
WHERE
    (sentence LIKE 'elect%' OR sentence LIKE '% elect%')
AND
    (sentence LIKE 'sul%' OR sentence LIKE '% sul%')

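This works (I think...): it finds everything it should find. However, it is very slow.

Is there a better way?

For what it's worth, I have the flexibility to redesign the table or to create additional "helper" tables.

For example, I have considered creating a table that holds one row for each unique word, together with the key of each sentence row that contains it.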
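A minimal sketch of what such a helper table could look like. The table name sentence_word, its columns, and both index choices are hypothetical, and it assumes the main table is called sentence with an integer id. The rows would have to be produced by whatever code splits each sentence into words and strips punctuation.

```sql
-- Hypothetical helper table: one row per (word, sentence) pair.
CREATE TABLE sentence_word (
    word        VARCHAR(64) NOT NULL,
    sentence_id INT         NOT NULL,
    PRIMARY KEY (word, sentence_id),      -- supports prefix lookups on word
    KEY idx_sentence_word (sentence_id, word)
);

-- Distinct words of sentence 10, lower-cased with punctuation removed.
INSERT INTO sentence_word (word, sentence_id) VALUES
    ('it', 10), ('was', 10), ('a', 10), ('queer', 10), ('sultry', 10),
    ('summer', 10), ('the', 10), ('they', 10), ('electrocuted', 10),
    ('rosenbergs', 10);

-- Sentences that have a word starting with 'elect'
-- AND a word starting with 'sul'.
SELECT s.id, s.sentence
FROM sentence s
WHERE EXISTS (SELECT 1
              FROM sentence_word w
              WHERE w.sentence_id = s.id
                AND w.word LIKE 'elect%')
  AND EXISTS (SELECT 1
              FROM sentence_word w
              WHERE w.sentence_id = s.id
                AND w.word LIKE 'sul%');
```

Because each pattern has only a trailing wildcard, MySQL can use a B-tree index on word for it, which is not possible for the leading-wildcard '% elect%' pattern on the full sentence.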

Also, the query needs to work in MySQL.

Many thanks in advance.

score 2 · Accepted Answer

Your approach is fine. If you want to handle multiple words, you can do something like this:

select s.id, s.sentence
from sentence s join
     (select 'elect' as word union all
      select 'sul' as word
     ) words
     on s.sentence like concat(words.word, '%') or
        s.sentence like concat('% ', words.word, '%')
group by s.id, s.sentence
-- compare against the number of search words; a subquery here cannot
-- select from the derived-table alias "words"
having count(*) = 2

This will not be faster (because of the extra group by), but it does give you more flexibility.
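If you are on MySQL 8.0 or later, one way to avoid hard-coding the word count is a common table expression, which can be referenced by name more than once in the same statement. This is only a sketch under that version assumption, using the same sentence table and the same two example words:

```sql
-- Requires MySQL 8.0+ (common table expressions).
WITH words AS (
    SELECT 'elect' AS word UNION ALL
    SELECT 'sul'
)
SELECT s.id, s.sentence
FROM sentence s
JOIN words w
  ON s.sentence LIKE CONCAT(w.word, '%')
  OR s.sentence LIKE CONCAT('% ', w.word, '%')
GROUP BY s.id, s.sentence
-- every word in the list must have matched this sentence
HAVING COUNT(*) = (SELECT COUNT(*) FROM words);
```

The count comparison relies on the words in the list being distinct, since each (sentence, word) pair contributes at most one joined row.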

By the way, have you looked into the full-text search capabilities in MySQL?
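As a rough sketch of what that could look like: a FULLTEXT index plus a boolean-mode search, where + marks a word as required and a trailing * matches any word beginning with that prefix. The index name ft_sentence is made up for illustration, and the table is assumed to be called sentence.

```sql
-- Hypothetical index name; InnoDB supports FULLTEXT from MySQL 5.6 on.
ALTER TABLE sentence ADD FULLTEXT INDEX ft_sentence (sentence);

-- Rows containing a word that starts with 'elect'
-- and a word that starts with 'sul', in any order.
SELECT id, sentence
FROM sentence
WHERE MATCH(sentence) AGAINST('+elect* +sul*' IN BOOLEAN MODE);
```

Two caveats worth checking before relying on this: full-text indexes skip stopwords and very short words, and "the" and "was" are on the default stopword lists, so the "the was" example would probably need a custom stopword configuration. The minimum indexed word length is controlled by innodb_ft_min_token_size (InnoDB) or ft_min_word_len (MyISAM).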
