I have a database of cities and states (about 43,000 rows). I run a full-text search against it like this:

select city, state, match(city, state_short, state) against (:q in boolean mode) as score
from zipcodes where
match(city, state_short, state) against (:q in boolean mode)
group by city, state order by score desc limit 6

This works when I replace :q with a meaningful string, but if I search for houston texas, I expect that exact result to come first, and instead it comes third:

  • North Houston, Texas
  • South Houston, Texas
  • Houston, Texas

How can I make Houston, Texas weigh more than the other two? Obviously the same should apply to other similarly named cities.

Edit

This works. Any thoughts?

SELECT * FROM (
    SELECT city, state, MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE) as score
    FROM zipcodes
    WHERE MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE)
    GROUP BY city, state
    ORDER BY score DESC LIMIT 6
) AS tbl
ORDER BY score DESC, LENGTH(city)

1 Answer

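Your new query may work, but it only gets there indirectly. Rather than ORDER BY LENGTH(city), something like ORDER BY ABS(LENGTH(:q) - (LENGTH(city) + LENGTH(state))) would be better. It isn't perfect, but it should be an improvement, because any row that has a high score and roughly the same length as the input is probably the one you're looking for. The final query would look something like this: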

SELECT city, state, MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE) AS score
FROM zipcodes
WHERE MATCH(city, state_short, state) AGAINST (:q IN BOOLEAN MODE)
GROUP BY city, state
ORDER BY score DESC, ABS(LENGTH(:q) - (LENGTH(city) + LENGTH(state))) ASC LIMIT 6

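I moved the new ORDER BY clause into the main query to get rid of the subquery. This should produce the same results, or possibly more accurate ones, since the tie-break is now applied before the LIMIT rather than after it.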
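To make the intent of the tie-break concrete, here is a minimal Python sketch that mimics the ordering on the three rows from the question: highest score first, then the smallest length difference first. The equal scores are an assumption made for illustration (real scores come from MATCH ... AGAINST), and `rank` is a hypothetical helper, not part of MySQL. Note that MySQL's LENGTH() counts bytes while Python's len() counts characters; the two agree for the ASCII names used here.

```python
# Hypothetical rows: (city, state, score). The tied scores are an assumption.
rows = [
    ("North Houston", "Texas", 1.0),
    ("South Houston", "Texas", 1.0),
    ("Houston", "Texas", 1.0),
]

def rank(rows, q):
    """Sort by score (highest first), then by how close
    len(city) + len(state) is to len(q) (closest first)."""
    def key(row):
        city, state, score = row
        return (-score, abs(len(q) - (len(city) + len(state))))
    return sorted(rows, key=key)

ranked = rank(rows, "houston texas")
for city, state, score in ranked:
    print(f"{city}, {state}")
```

With tied scores, Houston, Texas sorts ahead of the other two because its combined length (12) is closest to the length of the input (13). If the scores are not tied, the score still decides the order and the length difference never comes into play.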

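Levenshtein distance would probably be a more accurate measure for this, but MySQL has no native implementation of it. This article has more information on implementing a Levenshtein distance function in MySQL.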
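As an alternative to a MySQL stored function, the same idea could be applied in application code after fetching the top candidates. The sketch below is a plain dynamic-programming Levenshtein distance in Python plus a hypothetical `rerank` helper; the `(city, state, score)` row shape and the choice to compare against "city state" in lowercase are assumptions, not something the query above dictates.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]

def rerank(rows, q):
    """Hypothetical helper: score descending, then edit distance ascending."""
    q = q.lower()
    return sorted(
        rows,
        key=lambda r: (-r[2], levenshtein(q, f"{r[0]} {r[1]}".lower())),
    )

candidates = [
    ("North Houston", "Texas", 1.0),
    ("South Houston", "Texas", 1.0),
    ("Houston", "Texas", 1.0),
]
best = rerank(candidates, "houston texas")[0]
print(best)
```

This only re-orders rows that were already returned, so the query would need a LIMIT somewhat larger than 6 for the re-ranking to pull in a better match that the SQL ordering ranked lower.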

Answered 2013-03-21T14:16:21.113