5

Update: table and index definitions

desc activities;
+----------------+--------------+------+-----+---------+  
| Field          | Type         | Null | Key | Default |  
+----------------+--------------+------+-----+---------+  
| id             | int(11)      | NO   | PRI | NULL    |  
| trackable_id   | int(11)      | YES  | MUL | NULL    |  
| trackable_type | varchar(255) | YES  |     | NULL    |  
| owner_id       | int(11)      | YES  | MUL | NULL    |  
| owner_type     | varchar(255) | YES  |     | NULL    |  
| key            | varchar(255) | YES  |     | NULL    |  
| parameters     | text         | YES  |     | NULL    |  
| recipient_id   | int(11)      | YES  | MUL | NULL    |  
| recipient_type | varchar(255) | YES  |     | NULL    |  
| created_at     | datetime     | NO   |     | NULL    |  
| updated_at     | datetime     | NO   |     | NULL    |  
+----------------+--------------+------+-----+---------+  

show indexes from activities;

+------------+------------+-----------------------------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+  
| Table      | Non_unique | Key_name                                            | Seq_in_index | Column_name    | Collation | Cardinality | Sub_part | Packed | Null | Index_type |  
+------------+------------+-----------------------------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+  
| activities |          0 | PRIMARY                                             |            1 | id             | A         |        7263 |     NULL | NULL   |      | BTREE      |  
| activities |          1 | index_activities_on_trackable_id_and_trackable_type |            1 | trackable_id   | A         |        7263 |     NULL | NULL   | YES  | BTREE      |  
| activities |          1 | index_activities_on_trackable_id_and_trackable_type |            2 | trackable_type | A         |        7263 |     NULL | NULL   | YES  | BTREE      |  
| activities |          1 | index_activities_on_owner_id_and_owner_type         |            1 | owner_id       | A         |        7263 |     NULL | NULL   | YES  | BTREE      |  
| activities |          1 | index_activities_on_owner_id_and_owner_type         |            2 | owner_type     | A         |        7263 |     NULL | NULL   | YES  | BTREE      |  
| activities |          1 | index_activities_on_recipient_id_and_recipient_type |            1 | recipient_id   | A         |        2421 |     NULL | NULL   | YES  | BTREE      |  
| activities |          1 | index_activities_on_recipient_id_and_recipient_type |            2 | recipient_type | A         |        3631 |     NULL | NULL   | YES  | BTREE      |  
+------------+------------+-----------------------------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+  

select count(id) from activities;  
+-----------+  
| count(id) |  
+-----------+  
|      7117 |  
+-----------+  

This is what my current query looks like:

SELECT act.*, group_concat(act.owner_id order by act.created_at desc) as owner_ids 
FROM (select * from activities order by created_at desc) as act 
INNER JOIN users on users.id = act.owner_id 
WHERE (users.city_id = 1 and act.owner_type = 'User') 
GROUP BY trackable_type, recipient_id, recipient_type 
order by act.created_at desc 
limit 20 offset 0;

Running EXPLAIN on it: [explain output not included]

I have experimented a lot with this query, including the indexes and so on. Is there any way to optimize it?

4

5 Answers

1

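I don't think you need offset 0 at all, and it looks like you can do without the subquery as well. Since you don't use any fields from the users table, you can use in (or exists) to make that explicit: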

select
    a.trackable_type, a.recipient_id, a.recipient_type,
    max(a.created_at) as max_created_at,
    group_concat(a.owner_id order by a.created_at desc) as owner_ids 
from activities as a
where
    a.owner_type = 'User' and
    a.owner_id in (select u.id from users as u where u.city_id = 1)
group by a.trackable_type, a.recipient_id, a.recipient_type
order by max_created_at desc
limit 20;
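
The exists variant mentioned above is not spelled out, so here is a minimal sketch of how it might look. It assumes the same tables and columns as in the question and is meant to return the same groups as the in version. Which of the two the optimizer handles better depends on the MySQL version, so it is worth comparing their EXPLAIN output.

```sql
-- Sketch only: same grouping as the IN version, written with a correlated EXISTS.
SELECT
    a.trackable_type, a.recipient_id, a.recipient_type,
    MAX(a.created_at) AS max_created_at,
    GROUP_CONCAT(a.owner_id ORDER BY a.created_at DESC) AS owner_ids
FROM activities AS a
WHERE
    a.owner_type = 'User' AND
    -- keep only activities whose owner lives in city 1
    EXISTS (SELECT 1 FROM users AS u WHERE u.id = a.owner_id AND u.city_id = 1)
GROUP BY a.trackable_type, a.recipient_id, a.recipient_type
ORDER BY max_created_at DESC
LIMIT 20;
```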

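Also, I think your query will definitely get a performance boost if you create an index on (owner_type, owner_id) on activities (your existing index on owner_id, owner_type is not a good fit for this query) and an index on city_id on users.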
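
As a rough sketch, the two suggested indexes could be created like this. The index names are made up for illustration; only the table and column names come from the question.

```sql
-- Hypothetical index names; columns as suggested in this answer.
CREATE INDEX idx_activities_owner_type_owner_id
    ON activities (owner_type, owner_id);

CREATE INDEX idx_users_city_id
    ON users (city_id);
```

After creating them, re-run EXPLAIN on the query to check whether the optimizer actually picks them up.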

answered 2013-09-18T10:57:35.630
1

MySQL sometimes behaves strangely, so I would give this a try. I am assuming id is the primary key of the users table.

SELECT 
    act.trackable_type, act.recipient_id, act.recipient_type,
    max(act.created_at) as max_created_at,
    group_concat(act.owner_id order by act.created_at DESC) as owner_ids 
FROM activities act 
WHERE act.owner_id in (select id from users where city_id = 1)
AND act.owner_type = 'User'
GROUP BY trackable_type, recipient_id, recipient_type 
ORDER BY max_created_at DESC
LIMIT 20
answered 2013-09-19T09:05:26.497
0

First, I would start by making the query more readable :-)

You don't need the derived table with ORDER BY, and you should use a column list instead of ACT.*.

SELECT ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE, MAX(ACT.CREATED_AT) AS max_created,
   GROUP_CONCAT(ACT.OWNER_ID ORDER BY ACT.CREATED_AT DESC) AS OWNER_IDS 
FROM ACTIVITIES AS ACT 
JOIN USERS ON USERS.ID = ACT.OWNER_ID 
WHERE (USERS.CITY_ID = 1 AND ACT.OWNER_TYPE = 'User') 
GROUP BY ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE
ORDER BY max_created DESC 
LIMIT 20 OFFSET 0;

It might help to move the WHERE condition on users into a derived table:

SELECT ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE, MAX(ACT.CREATED_AT) AS max_created,
   GROUP_CONCAT(ACT.OWNER_ID ORDER BY ACT.CREATED_AT DESC) AS OWNER_IDS 
FROM ACTIVITIES AS ACT 
JOIN (SELECT ID FROM USERS WHERE CITY_ID = 1) USERS 
  ON USERS.ID = ACT.OWNER_ID 
WHERE ACT.OWNER_TYPE = 'User'
GROUP BY ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE
ORDER BY max_created DESC 
LIMIT 20 OFFSET 0;
answered 2013-09-16T09:05:47.110
0

Could you tell us the size of your users table, for example the result of the following query:

select count(id) from users WHERE users.city_id = 1;

If that is a small number, I would suggest using

SELECT act.trackable_type, act.recipient_id, act.recipient_type, max(act.created_at) as max_created_at,
    group_concat(act.owner_id order by act.created_at DESC) as owner_ids 
FROM  activities act 
WHERE act.owner_id in (select id from users where city_id = 1)
AND act.owner_Type = 'User'
GROUP BY trackable_type, recipient_id, recipient_type 
ORDER BY max_created_at DESC
LIMIT 20

Otherwise, a join would be better

SELECT ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE, MAX(ACT.CREATED_AT) AS max_created_at,
   GROUP_CONCAT(ACT.OWNER_ID ORDER BY ACT.CREATED_AT DESC) AS OWNER_IDS 
FROM ACTIVITIES ACT 
JOIN USERS ON (USERS.CITY_ID = 1 AND USERS.ID = ACT.OWNER_ID)
WHERE ACT.OWNER_TYPE = 'User'
GROUP BY ACT.TRACKABLE_TYPE, ACT.RECIPIENT_ID, ACT.RECIPIENT_TYPE
ORDER BY max_created_at DESC 
LIMIT 20;
answered 2013-09-23T06:21:33.253
0

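First of all, this is a really tricky query. You could build an interesting interview for a developer position around what the EXPLAIN output means and how to improve it =).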

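  1. MySQL uses nested-loop joins. This means that when there is a join, MySQL starts from one table and, for each matching row in it, loops over the related rows in the second table of the join.

  2. When you have no index, MySQL goes to disk for every row to fetch the fields used in the condition, and does the same for every row of the other table. Going to disk is expensive and slow; it is better to get the information from memory, which is why you want the data to come from an index.

  3. The join order is chosen by the MySQL optimizer. But you can steer MySQL by creating special indexes (and sometimes with hints as well).

  4. When you do something like (select * from activities order by created_at desc), you load the whole table into a temporary, unindexed table, which is not a good thing in any case. But the worst part is that MySQL should then start the join from that table, otherwise it has to check the condition against every row of it inside the nested loop.

  5. What does it mean to use an index for sorting or grouping (which also requires sorting)? It means that you read the data in the order of the index. But since MySQL uses nested-loop joins, you can take advantage of an index for sorting only if the table containing the fields you sort by comes first in the join.

  6. The created_at field is not included in the group by clause, which means you don't care which value is picked from the group (and they may well be identical within a group).

  7. In complex queries, especially those with pagination, it is usually better to select only the ids of the rows you need and then join back to the table by id to fetch the remaining fields (the less data you sort, the faster it goes).

  8. To sum up, we need to start the join from the activities table using an index, join users in the nested loop and collect the ids, and then join back to the activities table to get the remaining values.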

So you need a rather long composite index on activities (owner_type, trackable_type, recipient_id, recipient_type, owner_id, created_at), and an index on users (id, city_id), which may look extravagant but is needed.
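
A minimal sketch of those two indexes follows. The index names are hypothetical; the column lists are the ones given above. Note that three varchar(255) columns in one key may exceed the index key-length limit, depending on the character set and storage settings, in which case column prefixes would be needed.

```sql
-- Hypothetical index names; column order as proposed in this answer.
-- If this fails with a key-length error, prefix lengths such as
-- trackable_type(32) are one possible workaround (the 32 is an arbitrary guess).
CREATE INDEX idx_activities_grouping
    ON activities (owner_type, trackable_type, recipient_id,
                   recipient_type, owner_id, created_at);

-- Lets the join look up (id, city_id) from the index alone.
CREATE INDEX idx_users_id_city_id
    ON users (id, city_id);
```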

Now rewrite the query as:

SELECT *
FROM
  (SELECT a.id, group_concat(a.owner_id order by a.created_at desc) as owner_ids
   FROM activities a
   JOIN users u ON a.owner_id = u.id AND u.city_id = 1
   WHERE a.owner_type = 'User'
   GROUP BY trackable_type, recipient_id, recipient_type
   ORDER BY a.created_at desc
   limit 20 offset 0) as owners
JOIN activities a USING (id);

You should look at the EXPLAIN output and possibly use STRAIGHT_JOIN instead of JOIN in the subquery to ensure the right join order.
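
For illustration, checking the plan of the inner query with the join order forced might look like the sketch below. It is the subquery from this answer with JOIN swapped for STRAIGHT_JOIN, which makes MySQL read activities before users. Whether that is actually faster has to be judged from the EXPLAIN output on your data.

```sql
-- Sketch: inspect the plan of the inner query with a forced join order.
EXPLAIN
SELECT a.id, GROUP_CONCAT(a.owner_id ORDER BY a.created_at DESC) AS owner_ids
FROM activities a
STRAIGHT_JOIN users u ON a.owner_id = u.id AND u.city_id = 1  -- activities is read first
WHERE a.owner_type = 'User'
GROUP BY a.trackable_type, a.recipient_id, a.recipient_type
ORDER BY a.created_at DESC
LIMIT 20 OFFSET 0;
```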

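This solution looks resource-hungry, and it is. But it should be a good baseline for your further experiments. You should probably start by introducing some other fields to group on (having a varchar(255) in an index is not efficient, especially two of them), so you should use prefixes of sufficient length, and either introduce them explicitly as separate grouping columns or make the index use prefixes. You could also create a special grouper field that is a function of (trackable_type, recipient_id, recipient_type). The owner_type = 'User' comparison is not great either; it is better to compare integers and the like.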
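
One way the grouper field idea could be sketched is shown below. The column name group_key, the index name, and the choice of CRC32 as the function are all assumptions made for illustration, not something this answer specifies. A 32-bit checksum can collide, so distinct groups could be merged; treat this as a starting point for experiments only.

```sql
-- Hypothetical grouper column: a compact integer derived from the three grouping fields.
ALTER TABLE activities ADD COLUMN group_key INT UNSIGNED NULL;

-- NULLs are mapped to empty strings here, so NULL and '' land in the same group.
UPDATE activities
SET group_key = CRC32(CONCAT(IFNULL(trackable_type, ''), '|',
                             IFNULL(recipient_id, ''), '|',
                             IFNULL(recipient_type, '')));

-- A much shorter index than one containing the varchar(255) columns.
CREATE INDEX idx_activities_group_key
    ON activities (owner_type, group_key, created_at);
```

The column would also have to be kept up to date on insert and update, for example from application code.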

answered 2013-09-23T11:41:02.867