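I have a database with 80,000 rows, and while testing some FULLTEXT queries I got unexpected results. I have removed the stopwords from MySQL and set the minimum word length to 3.
When I run this query: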
SELECT `sentence`, MATCH (`sentence`) AGAINST ('CAN YOU FLY') AS `relevance`
FROM `sentences`
WHERE MATCH (`sentence`) AGAINST ('CAN YOU FLY')
ORDER BY `relevance` DESC
it returns this result (sentence | relevance):
NO A FLY WITHOUT WINGS WOULD BE CALLED A WINGLESS | 10.623517036438
I CAN FLY | 7.61278629302979
I CAN FLY :) | 7.61278629302979
CAN YOU FLY? | 7.61278629302979
THEY CAN FLY | 7.61278629302979
YOU AM NOT FLY | 7.61278629302979
CAN YOU FLY | 7.61278629302979
HAVE YOU EVER SWALLOWED A FLY? | 7.52720737457275
I JUST WANNA FLY | 7.52720737457275
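Why does "NO A FLY WITHOUT WINGS WOULD BE CALLED A WINGLESS" get the highest relevance when it contains only one of the search words? And why is "CAN YOU FLY" not at the top, given that it is an exact match?
I would like the results sorted by the number of matching keywords first, and then by the fewest words. That would give a logical result: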
CAN YOU FLY
CAN YOU FLY?
I CAN FLY
THEY CAN FLY
I CAN FLY :)
YOU AM NOT FLY
HAVE YOU EVER SWALLOWED A FLY?
I JUST WANNA FLY
NO A FLY WITHOUT WINGS WOULD BE CALLED A WINGLESS
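
To make the ordering I am after concrete, here is a rough Python sketch of the ranking rule. This is only an illustration of the desired behaviour, not how MySQL computes relevance. The names `rank_key` and `rank` are made up for this example, and the final tie-break on sentence length is my own assumption, added so that "CAN YOU FLY" sorts ahead of "CAN YOU FLY?".

```python
import re

# Sample rows taken from the query result above.
SENTENCES = [
    "NO A FLY WITHOUT WINGS WOULD BE CALLED A WINGLESS",
    "I CAN FLY",
    "I CAN FLY :)",
    "CAN YOU FLY?",
    "THEY CAN FLY",
    "YOU AM NOT FLY",
    "CAN YOU FLY",
    "HAVE YOU EVER SWALLOWED A FLY?",
    "I JUST WANNA FLY",
]


def rank_key(sentence, query):
    """Sort key: most matched keywords, then fewest words, then shortest text."""
    # Compare whole words only, ignoring case and punctuation.
    sentence_words = set(re.findall(r"[A-Z0-9']+", sentence.upper()))
    query_words = set(re.findall(r"[A-Z0-9']+", query.upper()))
    matched = len(sentence_words & query_words)
    # Word count uses whitespace-separated tokens, so ":)" counts as a token.
    token_count = len(sentence.split())
    # Negate the match count so that more matches sort first.
    return (-matched, token_count, len(sentence))


def rank(sentences, query):
    """Return the sentences in the desired order for the given query."""
    return sorted(sentences, key=lambda s: rank_key(s, query))


if __name__ == "__main__":
    for row in rank(SENTENCES, "CAN YOU FLY"):
        print(row)
```

For the nine rows above and the query 'CAN YOU FLY', this produces the same order as the list I wrote out by hand. What I am looking for is a way to get the same kind of ordering from the FULLTEXT query itself.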