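When I add a unique key to a table used in a multi-table full-text boolean search, the results cycle through 3 arbitrary states, only 1 of which is correct.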

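Keep this in mind when examining the sqlfiddle below, because the query may work correctly at first. In that case, add a space in the left panel, then rebuild and rerun; it should then break (but the failure is very sporadic).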

http://sqlfiddle.com/#!9/8d95ba/18

This is the problematic query:

SELECT `i`.`item_id`, `g_a`.`alias` AS `group`, `i`.`name` AS `name`
  FROM `item` `i`
  JOIN `group_alias` `g_a` USING (group_id)
    WHERE
      MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE)
    OR
      MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE);

Simple enough. But with the following unique index added:

ALTER TABLE `item_with_unique` ADD UNIQUE INDEX `unique_item_group` (`group_id`, `name`)

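The results cycle arbitrarily between these three states:

  1. Returns all rows, as if there were no WHERE clause
  2. Returns only the alias matches, as if the WHERE clause had no OR part
  3. Returns the correct results (in my experience, this is the rarest)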

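The behavior seems to stay consistent with whichever of these 3 states it is in until the query is changed in some minor way (such as adding parentheses) or the schema is rebuilt, at which point it may change.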

Is this some limitation described in the MySQL docs that I have missed? Is it a bug? Or have I just done something obviously wrong?

MySQL version 5.6.35 (sqlfiddle at the time of writing).

Sqlfiddle contents for posterity, in case the link dies:

CREATE TABLE `group` (
  `group_id` INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  `name` VARCHAR(256),
  FULLTEXT INDEX `search` (`name`)
) ENGINE = InnoDB;

CREATE TABLE `group_alias` (
  `group_id` INT UNSIGNED NOT NULL,
  `alias` VARCHAR(256),
  CONSTRAINT `alias_group_id`
    FOREIGN KEY (`group_id`)
    REFERENCES `group` (`group_id`),
  FULLTEXT INDEX `search` (`alias`)
) ENGINE = InnoDB;

CREATE TABLE `item` (
  `item_id` INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  `group_id` INT UNSIGNED,
  `name` VARCHAR(255) NOT NULL,
  CONSTRAINT `item_group_id`
    FOREIGN KEY (`group_id`)
    REFERENCES `group` (`group_id`),
  FULLTEXT INDEX `search` (`name`)
) ENGINE = InnoDB;

CREATE TABLE `item_with_unique` LIKE `item`;
ALTER TABLE `item_with_unique` ADD UNIQUE INDEX `unique_item_group` (`group_id`, `name`);

INSERT INTO `group` (`group_id`, `name`) VALUES (1, 'Thompson');
INSERT INTO `group` (`group_id`, `name`) VALUES (2, 'MacDonald');
INSERT INTO `group` (`group_id`, `name`) VALUES (3, 'Stewart');

INSERT INTO `group_alias` (`group_id`, `alias`) VALUES (1, 'Tomson');
INSERT INTO `group_alias` (`group_id`, `alias`) VALUES (2, 'Something');
INSERT INTO `group_alias` (`group_id`, `alias`) VALUES (3, 'MacStewart');

INSERT INTO `item` (`item_id`, `group_id`, `name`) VALUES (1, 1, 'MacTavish');
INSERT INTO `item` (`item_id`, `group_id`, `name`) VALUES (2, 1, 'MacTavish; Red');
INSERT INTO `item` (`item_id`, `group_id`, `name`) VALUES (3, 2, 'MacAgnew');
INSERT INTO `item` (`item_id`, `group_id`, `name`) VALUES (4, 3, 'Spider');
INSERT INTO `item` (`item_id`, `group_id`, `name`) VALUES (5, 2, 'blahblah');

INSERT INTO `item_with_unique` SELECT * FROM `item`;


SELECT `i`.`item_id`, `g_a`.`alias` AS `group`, `i`.`name` AS `name`,
IF(MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE), 1, 0) AS `group_match`,
IF(MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE), 1, 0) AS `item_match`
  FROM `item` `i`
  JOIN `group_alias` `g_a` USING (group_id)
    WHERE
      MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE)
    OR
      MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE);

SELECT "Same query, using table with unique index (NOTE: sporadically this is actually correct, in such case, skip to bottom notes)";
SELECT `i`.`item_id`, `g_a`.`alias` AS `group`, `i`.`name` AS `name`,
IF(MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE), 1, 0) AS `group_match`,
IF(MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE), 1, 0) AS `item_match`
  FROM `item_with_unique` `i`
  JOIN `group_alias` `g_a` USING (group_id)
    WHERE
      MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE)
    OR
      MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE);

SELECT "Union of the two OR match conditions separately (expected result from second query)";
SELECT `i`.`item_id`, `g_a`.`alias` AS `group`, `i`.`name` AS `name`,
IF(MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE), 1, 0) AS `group_match`,
IF(MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE), 1, 0) AS `item_match`
  FROM `item_with_unique` `i`
  JOIN `group_alias` `g_a` USING (group_id)
    WHERE
      MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE)
UNION
SELECT `i`.`item_id`, `g_a`.`alias` AS `group`, `i`.`name` AS `name`,
IF(MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE), 1, 0) AS `group_match`,
IF(MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE), 1, 0) AS `item_match`
  FROM `item_with_unique` `i`
  JOIN `group_alias` `g_a` USING (group_id)
    WHERE
      MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE);

SELECT "Now rebuild the schema (add a newline somewhere so sqlfiddle thinks it has changed) and observe the results of the second query.  It may take multiple attempts but it usually cycles between 3 states:";
SELECT "1: Returns ALL results as if there were no conditions (5 rows)";
SELECT "2: Returns results as if there were no second part to the OR condition (1 row)";
SELECT "3: Returns the correct results (rarely)";

2 Answers

Try using IGNORE INDEX in your statement:

SELECT `i`.`item_id`, `g_a`.`alias` AS `group`, `i`.`name` AS `name`
  FROM `item` `i`
  IGNORE INDEX (unique_item_group)
  JOIN `group_alias` `g_a` USING (group_id)
    WHERE
      MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE)
    OR
      MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE);

MySQL is simply foolish enough to sometimes pick unique_item_group, seemingly at random, for the full-text search.
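One way to check whether the optimizer really is choosing the unique index is to compare EXPLAIN output with and without the hint. This is only a sketch: it assumes the `item_with_unique` table from the question's fiddle (the table that actually carries `unique_item_group`), and the plan you see may differ by MySQL version and data.

```sql
-- Plan without the hint: look at the `key` column for table `i`.
EXPLAIN
SELECT `i`.`item_id`, `g_a`.`alias` AS `group`, `i`.`name` AS `name`
  FROM `item_with_unique` `i`
  JOIN `group_alias` `g_a` USING (group_id)
    WHERE
      MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE)
    OR
      MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE);

-- Same query with the hint: `unique_item_group` is no longer a candidate.
EXPLAIN
SELECT `i`.`item_id`, `g_a`.`alias` AS `group`, `i`.`name` AS `name`
  FROM `item_with_unique` `i`
  IGNORE INDEX (unique_item_group)
  JOIN `group_alias` `g_a` USING (group_id)
    WHERE
      MATCH (`g_a`.`alias`) AGAINST ('Mac*' IN BOOLEAN MODE)
    OR
      MATCH (`i`.`name`) AGAINST ('Mac*' IN BOOLEAN MODE);
```

If the `key` column of the first plan shows `unique_item_group` in the runs that return wrong rows, that would support the explanation above.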

answered 2019-10-28T13:33:27.010

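If your names and aliases are single words, and you are checking either the whole value or its leading part, then FULLTEXT is not the type of index you need.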

A simple INDEX(name), with name LIKE 'Mac%', will work very efficiently.
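A minimal sketch of that suggestion, assuming the `item` table from the question; the index name `name_prefix` is made up for illustration:

```sql
-- Hypothetical index name; an ordinary B-tree index on the column.
ALTER TABLE `item` ADD INDEX `name_prefix` (`name`);

-- A pattern with no leading wildcard lets MySQL do a range scan on that index.
SELECT `item_id`, `name`
  FROM `item`
  WHERE `name` LIKE 'Mac%';
```

Note that this is not a drop-in replacement for the full-text query: `LIKE 'Mac%'` only matches values that start with "Mac", whereas `'Mac*'` in boolean mode matches any word in the value that starts with "Mac".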

If you have long phrases containing many words, and "MacDonald" could be somewhere in the middle, then FULLTEXT with MATCH ... AGAINST is the right approach.

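With either type of index,

WHERE table1 ...
   OR table2 ...

will be inefficient. Roughly speaking, the optimizer has to perform a "cross join" to get all combinations of rows between the two tables, and then check which of them satisfy one MATCH/LIKE or the other.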
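A common way to avoid an OR that spans two tables is to split it into a UNION, so that each branch filters on a single table. The sketch below is not something this answer prescribes; it assumes the question's `item` and `group_alias` tables and uses the LIKE form for brevity:

```sql
-- Branch 1: rows whose item name matches.
SELECT `i`.`item_id`, `g_a`.`alias` AS `group`, `i`.`name` AS `name`
  FROM `item` `i`
  JOIN `group_alias` `g_a` USING (group_id)
  WHERE `i`.`name` LIKE 'Mac%'
UNION DISTINCT
-- Branch 2: rows whose group alias matches.
SELECT `i`.`item_id`, `g_a`.`alias` AS `group`, `i`.`name` AS `name`
  FROM `item` `i`
  JOIN `group_alias` `g_a` USING (group_id)
  WHERE `g_a`.`alias` LIKE 'Mac%';
```

UNION DISTINCT removes rows that satisfy both conditions, so the result matches what the OR version is meant to return. Whether it is actually faster depends on the data and the available indexes, so it is worth confirming with EXPLAIN.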

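Perhaps you have "over-normalized" the data? Can't name and alias both live in the same table? The query would be much faster, and there would be optimization techniques to make it faster still. With only 1K rows, what you have now will be noticeably slow; what I am suggesting can be optimized to work well beyond millions, or even billions, of rows.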
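As a rough illustration of the single-table idea, here is a sketch with a made-up table `item_flat`. It assumes each item row can carry one alias, which the question's schema does not guarantee (a group may have several aliases):

```sql
-- Hypothetical denormalized table: name and alias side by side.
CREATE TABLE `item_flat` (
  `item_id` INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  `name` VARCHAR(255) NOT NULL,
  `alias` VARCHAR(256),
  -- One FULLTEXT index covering both columns.
  FULLTEXT INDEX `search` (`name`, `alias`)
) ENGINE = InnoDB;

-- A single MATCH over both columns replaces the cross-table OR.
-- The column list must be the same as in the FULLTEXT index definition.
SELECT `item_id`, `alias` AS `group`, `name`
  FROM `item_flat`
  WHERE MATCH (`name`, `alias`) AGAINST ('Mac*' IN BOOLEAN MODE);
```

With both columns in one index, the search needs neither a join nor an OR between tables.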

answered 2017-07-24T20:18:51.283