I want to run the following search:

$schema->resultset('Entity')->search({ 
        -or => { "me.user_id" => $user_id, 'set_to_user.user_id' => $user_id } 
    }, {
        'distinct' => 1,
        'join' => {'entity_to_set' => {'entity_set' => 'set_to_user'}},
        'order_by' => {'-desc' => 'modified'},
        'page' => 1,'rows' => 100
    });

on a database with the tables shown below.

CREATE TABLE entity (
  id varchar(500) NOT NULL,
  user_id varchar(100) NOT NULL,
  modified timestamp NOT NULL,
  PRIMARY KEY (id, user_id),
  FOREIGN KEY (user_id) REFERENCES user(id) ON DELETE CASCADE ON UPDATE CASCADE
);

CREATE TABLE entity_to_set (
  set_id varchar(100) NOT NULL,
  user_id varchar(100) NOT NULL,
  entity_id varchar(500) NOT NULL,
  PRIMARY KEY (set_id, user_id, entity_id),
  FOREIGN KEY (entity_id, user_id) REFERENCES entity(id, user_id) ON DELETE CASCADE ON UPDATE CASCADE,
  FOREIGN KEY (set_id) REFERENCES entity_set(id) ON DELETE CASCADE ON UPDATE CASCADE
);

CREATE TABLE entity_set (
  id varchar(100) NOT NULL,
  PRIMARY KEY (id)
);

CREATE TABLE set_to_user (
  set_id varchar(100) NOT NULL,
  user_id varchar(100) NOT NULL,
  PRIMARY KEY (set_id, user_id),
  FOREIGN KEY (user_id) REFERENCES user(id) ON DELETE CASCADE ON UPDATE CASCADE,
  FOREIGN KEY (set_id) REFERENCES entity_set(id) ON DELETE CASCADE ON UPDATE CASCADE
);

CREATE TABLE user (
  id varchar(100) NOT NULL,
  PRIMARY KEY (id)
);

I have about 6000 rows in entity, 6000 in entity_to_set, 10 in entity_set, and 50 in set_to_user.

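Right now this query takes a while (a second or two), which is unfortunate. When I query only the entity table (including the ORDER BY), the results are nearly instantaneous. As a first debugging step, I found the actual SQL query that the DBIC code turns into: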

SELECT me.id, me.user_id, me.modified FROM entity me
LEFT JOIN entity_to_set entity_to_set ON ( entity_to_set.entity_id = me.id AND entity_to_set.user_id = me.user_id ) 
LEFT JOIN entity_set entity_set ON entity_set.id = entity_to_set.set_id 
LEFT JOIN set_to_user set_to_user ON set_to_user.set_id = entity_set.id 
WHERE ( ( set_to_user.user_id = 'Craigy' OR me.user_id = 'Craigy' ) ) 
GROUP BY me.id, me.user_id, me.modified ORDER BY modified DESC LIMIT 100;

Here is the result of EXPLAIN QUERY PLAN:

0|0|0|SCAN TABLE entity AS me USING INDEX sqlite_autoindex_entity_1 (~1000000 rows)
0|1|1|SEARCH TABLE entity_to_set AS entity_to_set USING COVERING INDEX entity_to_set_idx_cover (entity_id=? AND user_id=?) (~9 rows)
0|2|2|SEARCH TABLE entity_set AS entity_set USING COVERING INDEX sqlite_autoindex_entity_set_1 (id=?) (~1 rows)
0|3|3|SEARCH TABLE set_to_user AS set_to_user USING COVERING INDEX sqlite_autoindex_set_to_user_1 (set_id=?) (~5 rows)
0|0|0|USE TEMP B-TREE FOR ORDER BY

where entity_to_set_idx_cover is

CREATE INDEX entity_to_set_idx_cover ON entity_to_set (entity_id, user_id, set_id);

Now, the problem is the temporary b-tree being used for the sort, rather than the index that gets used when I don't do the joins.

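I noticed that DBIx::Class converts 'distinct' => 1 into a GROUP BY statement (I believe the documentation says they are equivalent here). I removed the GROUP BY statement and used SELECT DISTINCT instead, giving the following query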

SELECT DISTINCT me.id, me.user_id, me.modified FROM entity me
LEFT JOIN entity_to_set entity_to_set ON ( entity_to_set.entity_id = me.id AND entity_to_set.user_id = me.user_id ) 
LEFT JOIN entity_set entity_set ON entity_set.id = entity_to_set.set_id 
LEFT JOIN set_to_user set_to_user ON set_to_user.set_id = entity_set.id 
WHERE ( ( set_to_user.user_id = 'Craigy' OR me.user_id = 'Craigy' ) ) 
ORDER BY modified DESC LIMIT 100;

which I believe gives the same results. The EXPLAIN QUERY PLAN for this query is

0|0|0|SCAN TABLE entity AS me USING COVERING INDEX entity_sort_modified_user_id (~1000000 rows)
0|1|1|SEARCH TABLE entity_to_set AS entity_to_set USING COVERING INDEX entity_to_set_idx_cover (entity_id=? AND user_id=?) (~9 rows)
0|2|2|SEARCH TABLE entity_set AS entity_set USING COVERING INDEX sqlite_autoindex_entity_set_1 (id=?) (~1 rows)
0|3|3|SEARCH TABLE set_to_user AS set_to_user USING COVERING INDEX sqlite_autoindex_set_to_user_1 (set_id=?) (~5 rows)

where entity_sort_modified_user_id is an index created with

CREATE INDEX entity_sort_modified_user_id ON entity (modified, user_id, id);

This runs nearly instantly (no b-tree).
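The claim above that the GROUP BY and SELECT DISTINCT forms return the same rows can be checked on a small data set. The following is only a sketch using Python's built-in sqlite3 module rather than DBIx::Class: the schema is a cut-down copy of the tables above (foreign keys and the user table are left out), and the sample rows, the user 'Bob' and 'Alice', and the helper name run_both are made up for illustration.

```python
import sqlite3

# Cut-down copy of the question's schema (assumption: constraints other than
# the primary keys do not matter for comparing the two query forms).
SCHEMA = """
CREATE TABLE entity (id TEXT NOT NULL, user_id TEXT NOT NULL,
                     modified TIMESTAMP NOT NULL, PRIMARY KEY (id, user_id));
CREATE TABLE entity_to_set (set_id TEXT NOT NULL, user_id TEXT NOT NULL,
                            entity_id TEXT NOT NULL,
                            PRIMARY KEY (set_id, user_id, entity_id));
CREATE TABLE entity_set (id TEXT NOT NULL, PRIMARY KEY (id));
CREATE TABLE set_to_user (set_id TEXT NOT NULL, user_id TEXT NOT NULL,
                          PRIMARY KEY (set_id, user_id));
"""

# Shared FROM/JOIN/WHERE part, taken from the generated SQL in the question.
JOINS = """
FROM entity me
LEFT JOIN entity_to_set entity_to_set
  ON entity_to_set.entity_id = me.id AND entity_to_set.user_id = me.user_id
LEFT JOIN entity_set entity_set ON entity_set.id = entity_to_set.set_id
LEFT JOIN set_to_user set_to_user ON set_to_user.set_id = entity_set.id
WHERE set_to_user.user_id = ? OR me.user_id = ?
"""

GROUP_BY_SQL = ("SELECT me.id, me.user_id, me.modified" + JOINS +
                "GROUP BY me.id, me.user_id, me.modified "
                "ORDER BY modified DESC LIMIT 100")
DISTINCT_SQL = ("SELECT DISTINCT me.id, me.user_id, me.modified" + JOINS +
                "ORDER BY modified DESC LIMIT 100")


def run_both(conn, user_id):
    """Run both query forms for one user and return their row lists."""
    params = (user_id, user_id)
    grouped = conn.execute(GROUP_BY_SQL, params).fetchall()
    distinct = conn.execute(DISTINCT_SQL, params).fetchall()
    return grouped, distinct


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Hypothetical sample data: e1 is owned by Craigy and sits in two sets (so the
# joins produce duplicate rows), e2 is shared with Craigy through set s1,
# e3 and e4 are not visible to Craigy.
conn.executemany("INSERT INTO entity VALUES (?, ?, ?)", [
    ("e1", "Craigy", "2013-10-03"), ("e2", "Bob", "2013-10-02"),
    ("e3", "Alice", "2013-10-01"), ("e4", "Bob", "2013-10-04")])
conn.executemany("INSERT INTO entity_set VALUES (?)", [("s1",), ("s2",)])
conn.executemany("INSERT INTO entity_to_set VALUES (?, ?, ?)", [
    ("s1", "Craigy", "e1"), ("s2", "Craigy", "e1"),
    ("s1", "Bob", "e2"), ("s2", "Alice", "e3")])
conn.executemany("INSERT INTO set_to_user VALUES (?, ?)", [
    ("s1", "Craigy"), ("s1", "Bob"), ("s2", "Bob")])

group_rows, distinct_rows = run_both(conn, "Craigy")
print(group_rows == distinct_rows, group_rows)
```

Agreement on one data set is not a proof, but since the GROUP BY lists exactly the selected columns and no aggregate is used, both forms collapse the duplicate rows produced by the joins in the same way.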

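EDIT: To show that the problem persists when the ORDER BY is in ascending order, and to show the effect of indexes on these queries, here are similar queries on the same tables. The first two queries use SELECT DISTINCT and GROUP BY, respectively, without the index; the last two are the same queries with the index.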

sqlite> EXPLAIN QUERY PLAN SELECT DISTINCT me.id, me.user_id, me.modified FROM entity me LEFT JOIN entity_to_set entity_to_set ON ( entity_to_set.entity_id = me.id AND entity_to_set.user_id = me.user_id ) LEFT JOIN entity_set entity_set ON entity_set.id = entity_to_set.set_id WHERE ( me.user_id = 'Craigy' AND entity_set.id = 'SetID' ) ORDER BY modified LIMIT 100;
0|0|0|SCAN TABLE entity AS me (~100000 rows)
0|1|1|SEARCH TABLE entity_to_set AS entity_to_set USING AUTOMATIC COVERING INDEX (entity_id=? AND user_id=?) (~7 rows)
0|2|2|SEARCH TABLE entity_set AS entity_set USING COVERING INDEX sqlite_autoindex_entity_set_1 (id=?) (~1 rows)
0|0|0|USE TEMP B-TREE FOR DISTINCT
0|0|0|USE TEMP B-TREE FOR ORDER BY
sqlite> EXPLAIN QUERY PLAN SELECT me.id, me.user_id, me.modified FROM entity me LEFT JOIN entity_to_set entity_to_set ON ( entity_to_set.entity_id = me.id AND entity_to_set.user_id = me.user_id ) LEFT JOIN entity_set entity_set ON entity_set.id = entity_to_set.set_id WHERE ( me.user_id = 'Craigy' AND entity_set.id = 'SetID' ) GROUP BY me.id, me.user_id, me.modified ORDER BY modified LIMIT 100;
0|0|0|SCAN TABLE entity AS me USING INDEX sqlite_autoindex_entity_1 (~100000 rows)
0|1|1|SEARCH TABLE entity_to_set AS entity_to_set USING AUTOMATIC COVERING INDEX (entity_id=? AND user_id=?) (~7 rows)
0|2|2|SEARCH TABLE entity_set AS entity_set USING COVERING INDEX sqlite_autoindex_entity_set_1 (id=?) (~1 rows)
0|0|0|USE TEMP B-TREE FOR ORDER BY
sqlite> CREATE INDEX entity_idx_user_id_modified_id ON entity (user_id, modified, id);
sqlite> EXPLAIN QUERY PLAN SELECT DISTINCT me.id, me.user_id, me.modified FROM entity me LEFT JOIN entity_to_set entity_to_set ON ( entity_to_set.entity_id = me.id AND entity_to_set.user_id = me.user_id ) LEFT JOIN entity_set entity_set ON entity_set.id = entity_to_set.set_id WHERE ( me.user_id = 'Craigy' AND entity_set.id = 'SetID' ) ORDER BY modified LIMIT 100;
0|0|0|SEARCH TABLE entity AS me USING COVERING INDEX entity_idx_user_id_modified_id (user_id=?) (~10 rows)
0|1|1|SEARCH TABLE entity_to_set AS entity_to_set USING AUTOMATIC COVERING INDEX (entity_id=? AND user_id=?) (~7 rows)
0|2|2|SEARCH TABLE entity_set AS entity_set USING COVERING INDEX sqlite_autoindex_entity_set_1 (id=?) (~1 rows)
sqlite> EXPLAIN QUERY PLAN SELECT me.id, me.user_id, me.modified FROM entity me LEFT JOIN entity_to_set entity_to_set ON ( entity_to_set.entity_id = me.id AND entity_to_set.user_id = me.user_id ) LEFT JOIN entity_set entity_set ON entity_set.id = entity_to_set.set_id WHERE ( me.user_id = 'Craigy' AND entity_set.id = 'SetID' ) GROUP BY me.id, me.user_id, me.modified ORDER BY modified LIMIT 100;
0|0|0|SEARCH TABLE entity AS me USING COVERING INDEX entity_idx_user_id_modified_id (user_id=?) (~10 rows)
0|1|1|SEARCH TABLE entity_to_set AS entity_to_set USING AUTOMATIC COVERING INDEX (entity_id=? AND user_id=?) (~7 rows)
0|2|2|SEARCH TABLE entity_set AS entity_set USING COVERING INDEX sqlite_autoindex_entity_set_1 (id=?) (~1 rows)
0|0|0|USE TEMP B-TREE FOR GROUP BY
0|0|0|USE TEMP B-TREE FOR ORDER BY

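My question is: how can I fix my DBIx::Class code so that it performs as well as the SELECT DISTINCT query? Or how can I add indexes to make it work well? Or is some other kind of fix needed?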

1 Answer

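Note: this is not a complete answer to the question. It only shows how to avoid the temporary b-tree when sorting in ascending order. When the sort needs to be in descending order, there is currently (version 3.8.1) no way, short of modifying sqlite, to avoid the temporary b-tree for the GROUP BY version.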

Using the table definitions and indexes from the question:

sqlite> select sqlite_version();
sqlite_version()
----------------
3.8.1

Your query runs without a temporary b-tree when (a) you sort in ascending order and (b) the GROUP BY clause matches the ORDER BY clause column for column.

Apart from the GROUP BY and ORDER BY clauses, the query is unchanged:

/* table definitions as shown in the question */
sqlite> CREATE INDEX entity_to_set_idx_cover ON entity_to_set (entity_id, user_id, set_id);
sqlite> CREATE INDEX entity_sort_modified_user_id ON entity (modified, user_id, id);

sqlite> EXPLAIN QUERY PLAN
   ...> SELECT  me.id, me.user_id, me.modified FROM entity me
   ...> LEFT JOIN entity_to_set entity_to_set ON ( entity_to_set.entity_id = me.id AND entity_to_set.user_id = me.user_id )
   ...> LEFT JOIN entity_set entity_set ON entity_set.id = entity_to_set.set_id
   ...> LEFT JOIN set_to_user set_to_user ON set_to_user.set_id = entity_set.id
   ...> WHERE ( ( set_to_user.user_id = 'Craigy' OR me.user_id = 'Craigy' ) )
   ...> GROUP BY me.modified,  me.user_id, me.id
   ...> ORDER BY me.modified,  me.user_id, me.id ASC LIMIT 100;

selectid    order       from        detail
----------  ----------  ----------  -------------------------------------------------------------------------
0           0           0           SCAN TABLE entity AS me USING COVERING INDEX entity_sort_modified_user_id
0           1           1           SEARCH TABLE entity_to_set AS entity_to_set USING COVERING INDEX entity_t
0           2           2           SEARCH TABLE entity_set AS entity_set USING COVERING INDEX sqlite_autoind
0           3           3           SEARCH TABLE set_to_user AS set_to_user USING COVERING INDEX sqlite_autoi

However, when you sort in descending order, you get a temporary b-tree:

   ...> ...
   ...> GROUP BY me.modified,  me.user_id, me.id
   ...> ORDER BY me.modified,  me.user_id, me.id DESC LIMIT 100;
selectid    order       from        detail
----------  ----------  ----------  -------------------------------------------------------------------------
0           0           0           SCAN TABLE entity AS me USING COVERING INDEX entity_sort_modified_user_id
0           1           1           SEARCH TABLE entity_to_set AS entity_to_set USING COVERING INDEX entity_t
0           2           2           SEARCH TABLE entity_set AS entity_set USING COVERING INDEX sqlite_autoind
0           3           3           SEARCH TABLE set_to_user AS set_to_user USING COVERING INDEX sqlite_autoi
0           0           0           USE TEMP B-TREE FOR ORDER BY

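The reason is that sqlite (up to the current version, 3.8.1) does not recognize that it can do the grouping in descending order. You will therefore always get a separate sorting step. This cannot be avoided, even if the index is also declared DESC. See the discussion about this on the sqlite mailing list.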
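When trying out index or query variants like the ones above, it can help to check the plan programmatically instead of reading it by eye. The following is a rough sketch using Python's built-in sqlite3 module; the helper names plan_details and uses_temp_btree are made up for illustration, and the demonstration deliberately uses only the single entity table, because whether the joined GROUP BY queries need a temporary b-tree depends on the sqlite version in use.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of every plan row.
    return [row[-1] for row in rows]


def uses_temp_btree(conn, sql, params=()):
    """True if the plan contains a 'USE TEMP B-TREE ...' step."""
    return any("TEMP B-TREE" in detail
               for detail in plan_details(conn, sql, params))


conn = sqlite3.connect(":memory:")
# Only the entity table from the question; foreign keys are omitted here.
conn.execute(
    "CREATE TABLE entity (id TEXT NOT NULL, user_id TEXT NOT NULL, "
    "modified TIMESTAMP NOT NULL, PRIMARY KEY (id, user_id))")

QUERY = ("SELECT id, user_id, modified FROM entity "
         "ORDER BY modified DESC LIMIT 100")

# Without an index on modified, the sort needs a temporary b-tree.
unindexed = uses_temp_btree(conn, QUERY)

# With the covering index from the question, the index order can be used.
conn.execute("CREATE INDEX entity_sort_modified_user_id "
             "ON entity (modified, user_id, id)")
indexed = uses_temp_btree(conn, QUERY)

print("temp b-tree without index:", unindexed)
print("temp b-tree with index:", indexed)
```

The same helper can be pointed at the GROUP BY and SELECT DISTINCT queries from the question to see how a particular sqlite build plans them.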

Conclusion: if you want the query to be sorted in descending order without a temporary b-tree, you must adjust the SQL generation to use DISTINCT.

Answered 2013-10-21T16:42:15.827