Really struggling with a query that uses groupwise maximum, any help would be much appreciated. Feel free to point out if I should not be using groupwise maximum.

I have two tables, application and email; one application can have many emails. In my query I want all the details from application, joined to the email table (I'm actually only getting a foreign key from email for another table, which indicates whether the email has been replied to). I only want the last email sent, based on max(tstamp), which is why I am trying to use groupwise maximum.

I've tried this, but it seems to make a duplicate of each row:

SELECT  `application` . * ,  `email1`.`student_email_id` AS  `email_student_email_id` 
FROM  `application` 
LEFT JOIN (
  SELECT MAX( tstamp ) AS tstamp, id, student_email_id, application_id
  FROM email
  GROUP BY id, student_email_id, application_id
) AS email1 ON  `email1`.`application_id` =  `application`.`id` 
WHERE  `application`.`status` =  'returned'

This is what seemed to work at first but is causing issues now and I'm sure it's pretty sloppy code:

select `application`.*, `email1`.`student_email_id` as `email_student_email_id`
from `application` 
left join (
  select student_email_id, max(tstamp) as tstamp, application_id
  from email 
  group by application_id, tstamp
  order by tstamp desc
  limit 1) as email1 on `email1`.`application_id` = `application`.`id` 
where `application`.`status` = 'returned'

Any guidance would be highly appreciated, if you need to see more code please ask! Thanks.

Further clarity if needed for my db set up and what should be happening (left out unimportant parts):

Application Table
+----+----------+
| id |  status  |
+----+----------+
|  1 | returned |
+----+----------+

Email Table
+----+------------+----------------+------------------+
| id |   tstamp   | application_id | student_email_id |
+----+------------+----------------+------------------+
|  1 | 2014-12-26 |              1 | NULL             |
|  2 | 2014-12-27 |              1 | 3                |
+----+------------+----------------+------------------+

The query should be showing the following:

+----+----------+------------------------+
| id |  status  | email_student_email_id |
+----+----------+------------------------+
|  1 | returned |                      3 |
+----+----------+------------------------+

First solution above shows duplicates of everything (maybe I'm nearly there) and second one shows null for the joined table columns, although I'm sure it did work at one stage or in isolation at least!

1 Answer

You're looking for the latest row in the email table for each distinct application_id.

Your subquery isn't quite right. Here's how you get it.

SELECT s.application_id, e.student_email_id
  FROM email e
  JOIN (
         SELECT MAX(tstamp) tstamp, application_id
           FROM email
          GROUP BY application_id
       ) s ON e.application_id = s.application_id AND e.tstamp = s.tstamp

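There's another way to do this, which may be more efficient. It works if the id column is an autoincrement column, so that the highest id for an application belongs to its most recent email.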

SELECT e.application_id, e.student_email_id  /* s exposes only id, so take application_id from e */
  FROM email e
  JOIN (
         SELECT MAX(id) id
           FROM email
          GROUP BY application_id
       ) s ON e.id = s.id

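Either of these subqueries gets the latest student_email_id for each application_id. The second one uses a JOIN to pick out just the highest id for each application_id, then uses that id to find the latest student_email_id.

Your subquery was this. It doesn't get what you hoped for.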
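To check the two variants against the sample rows from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the SQL used here is plain enough to run on both), and the names conn, by_tstamp and by_id are only for illustration. The second variant selects application_id from e, because the inner query exposes only id.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE email (
        id INTEGER PRIMARY KEY,
        tstamp TEXT,
        application_id INTEGER,
        student_email_id INTEGER
    );
    INSERT INTO email VALUES (1, '2014-12-26', 1, NULL);
    INSERT INTO email VALUES (2, '2014-12-27', 1, 3);
""")

# Variant 1: join back on (application_id, MAX(tstamp)).
by_tstamp = conn.execute("""
    SELECT s.application_id, e.student_email_id
      FROM email e
      JOIN (
             SELECT MAX(tstamp) tstamp, application_id
               FROM email
              GROUP BY application_id
           ) s ON e.application_id = s.application_id AND e.tstamp = s.tstamp
""").fetchall()

# Variant 2: join back on MAX(id); assumes a higher id means a later email.
by_id = conn.execute("""
    SELECT e.application_id, e.student_email_id
      FROM email e
      JOIN (
             SELECT MAX(id) id
               FROM email
              GROUP BY application_id
           ) s ON e.id = s.id
""").fetchall()

print(by_tstamp)  # [(1, 3)]
print(by_id)      # [(1, 3)]
```

Note that the tstamp variant can return more than one row per application if two emails share the same latest tstamp; the id variant cannot, since id is unique.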

 SELECT MAX( tstamp ) AS tstamp, id, student_email_id, application_id /*wrong*/
   FROM email
  GROUP BY id, student_email_id, application_id 

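You're grouping by id, which is unique per row, so every group holds exactly one row and you get all the detail rows back. That's not what you want. Even this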

 SELECT MAX( tstamp ) AS tstamp, student_email_id, application_id  /*wrong*/
   FROM email
  GROUP BY student_email_id, application_id 

will give multiple records for each application_id value.

So the query you need is:
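The effect of the extra GROUP BY column can be seen on the question's sample data. This is a rough sketch, again assuming Python's sqlite3 in place of MySQL; the names too_fine and per_app are made up for the example.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL email table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE email (
        id INTEGER PRIMARY KEY,
        tstamp TEXT,
        application_id INTEGER,
        student_email_id INTEGER
    );
    INSERT INTO email VALUES (1, '2014-12-26', 1, NULL);
    INSERT INTO email VALUES (2, '2014-12-27', 1, 3);
""")

# Grouping by student_email_id as well: one group per distinct
# (student_email_id, application_id) pair, so application 1 appears twice.
too_fine = conn.execute("""
    SELECT MAX(tstamp) AS tstamp, student_email_id, application_id
      FROM email
     GROUP BY student_email_id, application_id
""").fetchall()

# Grouping by application_id only: exactly one row per application.
per_app = conn.execute("""
    SELECT MAX(tstamp) AS tstamp, application_id
      FROM email
     GROUP BY application_id
""").fetchall()

print(len(too_fine))  # 2
print(per_app)        # [('2014-12-27', 1)]
```

Every column added to GROUP BY can only split groups further, which is why the grouping has to be on application_id alone.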

SELECT  application.* ,  email1.student_email_id AS  email_student_email_id 
  FROM  application 
  LEFT JOIN (
              SELECT e.application_id, e.student_email_id
                FROM email e  
                JOIN (
                       SELECT MAX(id) id
                         FROM email
                        GROUP BY application_id
                     ) s ON e.id = s.id
           ) AS email1 ON  email1.application_id =  application.id 
 WHERE application.status =  'returned'
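As a quick check of the full query, here is a sketch that runs it on the sample tables from the question. The assumptions: Python's sqlite3 stands in for MySQL, application 2 is a hypothetical extra row (not in the question) added to show what the LEFT JOIN does when there are no emails, and the ORDER BY is added only to make the output order predictable. The inner SELECT takes application_id from e, since the innermost query returns only id.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE application (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE email (
        id INTEGER PRIMARY KEY,
        tstamp TEXT,
        application_id INTEGER,
        student_email_id INTEGER
    );
    INSERT INTO application VALUES (1, 'returned');
    INSERT INTO application VALUES (2, 'returned');  -- hypothetical: no emails
    INSERT INTO email VALUES (1, '2014-12-26', 1, NULL);
    INSERT INTO email VALUES (2, '2014-12-27', 1, 3);
""")

rows = conn.execute("""
    SELECT application.*, email1.student_email_id AS email_student_email_id
      FROM application
      LEFT JOIN (
                  SELECT e.application_id, e.student_email_id
                    FROM email e
                    JOIN (
                           SELECT MAX(id) id
                             FROM email
                            GROUP BY application_id
                         ) s ON e.id = s.id
               ) AS email1 ON email1.application_id = application.id
     WHERE application.status = 'returned'
     ORDER BY application.id
""").fetchall()

print(rows)  # [(1, 'returned', 3), (2, 'returned', None)]
```

Application 1 comes back once, with the student_email_id of its latest email, and the application without emails is kept with a NULL in the joined column.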

When you design queries like this, it's wise to test them from the inside out, starting with the innermost subquery.

answered 2014-12-26T23:32:23.440