Here is my query:

EXPLAIN SELECT Count(1), 
       user_id, 
       type 
FROM   (SELECT e.user_id, 
               e.type, 
               Max(r.date_time) last_seen, 
               e.date_time      event_time 
        FROM   events e 
               JOIN requests r 
                 ON e.user_id = r.user_id 
                    AND e.type IN( 3, 5, 6 ) 
        GROUP  BY e.user_id, 
                  e.date_time, 
                  e.type 
        HAVING last_seen < event_time) x 
GROUP  BY user_id, 
          type

Also here is the result of EXPLAIN :

[Image: EXPLAIN output for the full query]

Also here is the result of that subquery (x) EXPLAIN:

[Image: EXPLAIN output for the subquery x]

See? The subquery on its own has a much better plan, so the problem is the outer grouping. Any idea how I can make this query faster?


EDIT: We need two tables:

  1. requests table -- A new row is inserted for each user request, so the latest (largest) date_time roughly marks the last time the user was online on our website.

  2. events table -- A new row is inserted for each answer or comment.

We're talking about a Q/A website. All we're trying to do is "send an email to the users who got a new comment/answer after the last time they were online on our website".

4 Answers

You need proper indexes on the tables that match the WHERE clause and the ORDER BY to help the optimizer.

table      index on...
events     ( type, user_id, date_time )
requests   ( user_id, date_time ) 

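I might even suggest a slight adjustment to the query.
Change your

AND e.type IN( 3, 5, 6 ) 

to

WHERE e.type IN( 3, 5, 6 ) 

because e.type belongs to the primary table of the query and has nothing to do with the actual JOIN to the requests table. The join condition should list only the columns that relate the two tables.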
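As a quick check of the point above, here is a minimal sketch that runs both variants on a few made-up rows. It uses Python's built-in sqlite3 module purely so the example is self-contained; that is an assumption of this sketch, since the question itself is about MySQL. For an inner join the two placements return the same rows, so the change is about readability, not results. With an outer join they would not be equivalent.

```python
import sqlite3

# In-memory database with made-up sample rows (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events   (user_id INTEGER, type INTEGER, date_time TEXT);
    CREATE TABLE requests (user_id INTEGER, date_time TEXT);
    INSERT INTO events VALUES
        (1, 3, '2018-12-02 10:00:00'),
        (1, 4, '2018-12-02 11:00:00'),
        (2, 5, '2018-12-01 09:00:00');
    INSERT INTO requests VALUES
        (1, '2018-12-01 08:00:00'),
        (2, '2018-12-03 08:00:00');
""")

# Variant 1: the type filter sits in the ON clause, as in the question.
filter_in_on = conn.execute("""
    SELECT e.user_id, e.type, e.date_time, r.date_time
    FROM events e
    JOIN requests r ON e.user_id = r.user_id
                   AND e.type IN (3, 5, 6)
    ORDER BY e.user_id, e.type
""").fetchall()

# Variant 2: the type filter sits in the WHERE clause, as suggested.
filter_in_where = conn.execute("""
    SELECT e.user_id, e.type, e.date_time, r.date_time
    FROM events e
    JOIN requests r ON e.user_id = r.user_id
    WHERE e.type IN (3, 5, 6)
    ORDER BY e.user_id, e.type
""").fetchall()

print(filter_in_on == filter_in_where)
```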

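SUGGESTION per the edited question. I might offer another option: add a lastRequest date/time column to your user table. Then, whenever a request is entered for that user, update that field in the user table. You no longer need a MAX() subquery to find out when the user was last seen. As your requests table gets larger, so will the run time of your current query. By looking at the user table ONCE for the already-known latest request, you have your answer. Query 10,000 users, or 2 million requests... your choice :) This might simplify your query to: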

select 
      u.user_id,
      e.type,
      count(*) CountPerType,
      min( e.date_time ) firstEventDateAfterUsersLastRequest
   from
      user u
         join events e 
            on u.user_id = e.user_id
           AND e.type in ( 3, 5, 6 )
           AND e.date_time > u.lastRequest
   group by
      u.user_id,
      e.type

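So the join already has the baseline date/time for each user, and you only look for the records that came in after that person last requested something (hence a follow-up is due).

Then, to prime the new column in the user table, you just update each user with their max( requests.date_time ).

If a person was last active on Nov 27 and has had 5 responses across 3 different event types since then, that person is still evaluated against the Nov 27 date, while other people may have newer or older lastRequest dates.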
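The priming step and the per-request maintenance described above might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 instead of MySQL so it can run on its own, the sample rows are made up, timestamps are stored as ISO-formatted text so that string comparison matches chronological order, and `record_request` is a hypothetical helper name, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user     (user_id INTEGER PRIMARY KEY, lastRequest TEXT);
    CREATE TABLE requests (user_id INTEGER, date_time TEXT);
    CREATE TABLE events   (user_id INTEGER, type INTEGER, date_time TEXT);
    INSERT INTO user (user_id) VALUES (1), (2);
    INSERT INTO requests VALUES
        (1, '2018-11-20 09:00:00'),
        (1, '2018-11-27 18:30:00'),
        (2, '2018-12-02 12:00:00');
    INSERT INTO events VALUES
        (1, 3, '2018-11-28 10:00:00'),
        (1, 3, '2018-11-29 10:00:00'),
        (1, 5, '2018-11-30 10:00:00'),
        (2, 6, '2018-12-01 10:00:00');
""")

# One-time priming: copy each user's latest request time into the new column.
conn.execute("""
    UPDATE user
    SET lastRequest = (SELECT MAX(r.date_time)
                       FROM requests r
                       WHERE r.user_id = user.user_id)
""")

def record_request(conn, user_id, date_time):
    """Hypothetical helper: log a request and keep lastRequest in sync."""
    with conn:  # both statements commit together, or roll back together
        conn.execute("INSERT INTO requests VALUES (?, ?)", (user_id, date_time))
        conn.execute("UPDATE user SET lastRequest = ? WHERE user_id = ?",
                     (date_time, user_id))

# The simplified query from the answer: no MAX() over requests is needed.
PENDING_SQL = """
    SELECT u.user_id, e.type, COUNT(*), MIN(e.date_time)
    FROM user u
    JOIN events e ON u.user_id = e.user_id
                 AND e.type IN (3, 5, 6)
                 AND e.date_time > u.lastRequest
    GROUP BY u.user_id, e.type
    ORDER BY u.user_id, e.type
"""
rows = conn.execute(PENDING_SQL).fetchall()
print(rows)
```

In this sample, user 1 was last seen on Nov 27 and has events after that date, while user 2's only event predates the last request, so only user 1 shows up. Calling `record_request(conn, 1, '2018-12-01 00:00:00')` and running the query again would return no rows for user 1 either.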

Just an optional thought..

Answered 2018-12-03T14:53:33.337

I would rewrite the query like this:

select user_id, type, count(*)
from (select e.user_id, e.type, e.date_time, 
             (select max(r.date_time)
              from requests r
              where r.user_id = e.user_id
              ) as last_seen 
       from events e 
       where e.type  in ( 3, 5, 6 ) 
      ) er
where last_seen < date_time
group by user_id, type;

Then, I would want to be sure there are indexes on requests(user_id, date_time) and events(type, user_id, date_time).
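To sanity-check that the rewrite counts the same events as the query in the question, here is a small sketch on made-up rows. It assumes Python's built-in sqlite3 as a stand-in for MySQL, and it spells out the HAVING aliases of the original query as `MAX(r.date_time) < e.date_time`. The two forms only agree when a user has no two events of the same type with an identical date_time, because the original groups by date_time and the rewrite counts rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events   (user_id INTEGER, type INTEGER, date_time TEXT);
    CREATE TABLE requests (user_id INTEGER, date_time TEXT);
    INSERT INTO events VALUES
        (1, 3, '2018-12-02 10:00:00'),
        (1, 3, '2018-12-02 11:00:00'),
        (1, 5, '2018-11-30 10:00:00'),
        (2, 6, '2018-12-02 09:00:00');
    INSERT INTO requests VALUES
        (1, '2018-11-29 08:00:00'),
        (1, '2018-12-01 08:00:00'),
        (2, '2018-12-03 08:00:00');
""")

# Query from the question, with the HAVING aliases written out.
ORIGINAL_SQL = """
    SELECT user_id, type, COUNT(*)
    FROM (SELECT e.user_id, e.type, e.date_time
          FROM events e
          JOIN requests r ON e.user_id = r.user_id
                         AND e.type IN (3, 5, 6)
          GROUP BY e.user_id, e.date_time, e.type
          HAVING MAX(r.date_time) < e.date_time) x
    GROUP BY user_id, type
    ORDER BY user_id, type
"""

# Rewrite from this answer: correlated subquery for the last-seen time.
REWRITE_SQL = """
    SELECT user_id, type, COUNT(*)
    FROM (SELECT e.user_id, e.type, e.date_time,
                 (SELECT MAX(r.date_time)
                  FROM requests r
                  WHERE r.user_id = e.user_id) AS last_seen
          FROM events e
          WHERE e.type IN (3, 5, 6)) er
    WHERE last_seen < date_time
    GROUP BY user_id, type
    ORDER BY user_id, type
"""

original = conn.execute(ORIGINAL_SQL).fetchall()
rewritten = conn.execute(REWRITE_SQL).fetchall()
print(original == rewritten)
```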

Answered 2018-12-03T16:32:00.037

http://sqlfiddle.com/#!9/c73878/1

ALTER TABLE `events` ADD INDEX e_type (type);
ALTER TABLE `events` ADD INDEX user_time (user_id, date_time);
ALTER TABLE requests ADD INDEX user_time (user_id, date_time);

SELECT  COUNT(*),
        e.user_id, 
        e.type
FROM `events` e 
JOIN  (
  SELECT user_id, Max(r.date_time) last_seen
  FROM requests r 
  GROUP BY user_id
) r
ON e.user_id = r.user_id 
   AND e.date_time > r.last_seen
WHERE e.type IN( 3, 5, 6 ) 
GROUP  BY e.user_id,  
       e.type 

Answered 2018-12-03T15:30:34.287

See if this gets the "right" answer:

SELECT  COUNT(DISTINCT e.date_time),
        e.user_id, e.type
    FROM  events e
    JOIN  requests r  ON  e.user_id = r.user_id
                     AND  e.type IN( 3, 5, 6 )
    GROUP BY  e.user_id, e.type
    HAVING  MAX(r.date_time) < e.date_time  -- the event timestamp column in events is date_time

Indexes:

e:  INDEX(type)   -- may be useful (depends on cardinality)
r:  INDEX(user_id, date_time)  -- in this order

Answered 2018-12-03T21:58:29.847