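So, with a little help from kind SO users, I ended up with a logically correct MySQL query for a task I'm working on: retrieving a chronologically ordered list of ids for the news items a user is allowed to see, with certain types of grouped items filtered down to a single representative of the group. (Phew!)

The obvious remaining problem is that the query is very unwieldy and slow - it clocks in at 145000 ms according to CakePHP's database call debug printout, yikes.

Is there a sensible way to tame a beast like this, or should I admit that I've bitten off more than I can chew here and look for a less cumbersome way to get more or less similar results? All suggestions appreciated.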

    SELECT DISTINCT Uid.id, Uid.type
    FROM (SELECT uids.id id, uids_uids.parent_id parent_id, uids.created date,
                 uids.type type
          FROM uids 
          JOIN uids_uids ON uids_uids.uid_id = uids.id
          JOIN aros_uids ON uids.id = aros_uids.uid_id
          JOIN uids_uids ParentUids ON uids_uids.parent_id = ParentUids.uid_id
          WHERE uids.type IN ('Document','Photo','Release','PreRelease',
                              'ArtworkResource','Event') 
            AND (uids.start_date IS NULL OR uids.start_date <= NOW())
            AND (uids.end_date IS NULL OR uids.end_date <= NOW())
            AND aros_uids.aro_id IN (3,2,86,1448)
          ) Uid
    JOIN (SELECT uids_uids.parent_id parent_id, MAX(uids.created) maxdate
          FROM uids JOIN uids_uids
          ON uids_uids.uid_id = uids.id
          GROUP BY uids_uids.parent_id, uids.type) T2
    ON Uid.parent_id = T2.parent_id AND Uid.date = T2.maxdate
    ORDER BY Uid.date DESC
    LIMIT 100

ETA:

OK, as a first pass I converted those subselects into views, so now the query looks much more manageable:

    SELECT DISTINCT Uid.id, Uid.type
    FROM UidView Uid
    JOIN UidView2 T2
    ON Uid.parent_id = T2.parent_id AND Uid.date = T2.maxdate
    WHERE Uid.aro_id IN (3,2,86,1448)
    ORDER BY Uid.date DESC
    LIMIT 100

This definitely helped, reducing Cake's estimated query time (in ms) from six figures to around 2500. Definitely a good start!
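
The view definitions are not shown above. A plausible sketch, assuming they simply wrap the two subselects and that `UidView` exposes `aro_id` so the filter can move to the outer query (the column list and aliases here are assumptions, not the actual definitions):

```sql
-- Hypothetical definition of UidView: the first subselect, with aro_id
-- exposed as a column instead of being filtered inside the view.
CREATE VIEW UidView AS
SELECT uids.id AS id, uids_uids.parent_id AS parent_id,
       uids.created AS `date`, uids.type AS type,
       aros_uids.aro_id AS aro_id
FROM uids
JOIN uids_uids ON uids_uids.uid_id = uids.id
JOIN aros_uids ON uids.id = aros_uids.uid_id
JOIN uids_uids ParentUids ON uids_uids.parent_id = ParentUids.uid_id
WHERE uids.type IN ('Document','Photo','Release','PreRelease',
                    'ArtworkResource','Event')
  AND (uids.start_date IS NULL OR uids.start_date <= NOW())
  AND (uids.end_date IS NULL OR uids.end_date <= NOW());

-- Hypothetical definition of UidView2: the second subselect, unchanged.
CREATE VIEW UidView2 AS
SELECT uids_uids.parent_id AS parent_id, MAX(uids.created) AS maxdate
FROM uids
JOIN uids_uids ON uids_uids.uid_id = uids.id
GROUP BY uids_uids.parent_id, uids.type;
```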

1 Answer

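Here is what I would try:

Take each derived query and run `EXPLAIN` on it separately. As the comments suggest, check for any rows that are missing an index and add indexes where needed. Post your `EXPLAIN` results if you want help with them. So: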

    EXPLAIN SELECT uids.id id, uids_uids.parent_id parent_id, uids.created date, ....
    EXPLAIN SELECT uids_uids.parent_id parent_id, MAX(uids.created) maxdate ....
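
As a rough illustration of the "add indexes where needed" step, the statements below cover the join and filter columns that appear in the two subqueries. The index names are made up, and whether any of these actually helps is an assumption until the `EXPLAIN` output confirms it; some may already exist as primary or foreign keys.

```sql
-- Check what is already indexed before adding anything.
SHOW INDEX FROM uids_uids;
SHOW INDEX FROM aros_uids;
SHOW INDEX FROM uids;

-- Hypothetical index names; skip any that duplicate an existing key.
-- Join columns on the link table, in both lookup directions.
ALTER TABLE uids_uids ADD INDEX idx_uu_uid_parent (uid_id, parent_id);
ALTER TABLE uids_uids ADD INDEX idx_uu_parent_uid (parent_id, uid_id);

-- Permission filter (aro_id IN (...)) followed by the join column.
ALTER TABLE aros_uids ADD INDEX idx_au_aro_uid (aro_id, uid_id);

-- Type filter plus the column used for MAX() and ORDER BY.
ALTER TABLE uids ADD INDEX idx_uids_type_created (type, created);
```

Re-running the two `EXPLAIN` statements afterwards shows whether the `key` column now lists one of the new indexes.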

If adding indexes doesn't help, or doesn't help much, then first put each subquery into a temporary table and apply indexes to it:

    CREATE TABLE temp_uid
    SELECT uids.id id, uids_uids.parent_id parent_id, uids.created date,
           uids.type type
    FROM uids
    JOIN uids_uids ON uids_uids.uid_id = uids.id
    JOIN aros_uids ON uids.id = aros_uids.uid_id
    JOIN uids_uids ParentUids ON uids_uids.parent_id = ParentUids.uid_id
    WHERE uids.type IN ('Document','Photo','Release','PreRelease',
                        'ArtworkResource','Event')
      AND (uids.start_date IS NULL OR uids.start_date <= NOW())
      AND (uids.end_date IS NULL OR uids.end_date <= NOW())
      AND aros_uids.aro_id IN (3,2,86,1448);

    CREATE TABLE temp_t2
    SELECT uids_uids.parent_id parent_id, MAX(uids.created) maxdate
    FROM uids
    JOIN uids_uids ON uids_uids.uid_id = uids.id
    GROUP BY uids_uids.parent_id, uids.type;

Then `JOIN` on those tables:

    SELECT DISTINCT Uid.id, Uid.type
    FROM temp_uid AS Uid
    JOIN temp_t2 AS T2 ON Uid.parent_id = T2.parent_id AND Uid.date = T2.maxdate
    ORDER BY Uid.date DESC
    LIMIT 100;

As I mentioned, you will probably have to add indexes, most likely on these columns of the temporary tables:

    ALTER TABLE temp_uid ADD INDEX parentDateIdx (parent_id, `date`);
    ALTER TABLE temp_t2 ADD INDEX parentMaxDateIdx (parent_id, maxdate);
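
To check whether the join actually uses those indexes, the same `EXPLAIN` approach can be applied to the final query. This is only a sketch of the check; the plan you get depends on your data and MySQL version.

```sql
-- Inspect the plan for the join over the two temporary tables.
-- The `key` column should name parentDateIdx or parentMaxDateIdx
-- if the optimizer picked one of the new indexes.
EXPLAIN
SELECT DISTINCT Uid.id, Uid.type
FROM temp_uid AS Uid
JOIN temp_t2 AS T2 ON Uid.parent_id = T2.parent_id AND Uid.date = T2.maxdate
ORDER BY Uid.date DESC
LIMIT 100;
```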

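If you need to refresh the temporary tables, just truncate them and run `INSERT INTO temp_uid...SELECT` and `INSERT INTO temp_t2...SELECT` against them, rather than `CREATE...SELECT`. A stored procedure is very useful for this.

By the way, doing a `CREATE TABLE temp_t2...SELECT` the way I did for each temp table may not create the optimal table structure, so it may be best to alter the table afterwards or define it yourself from scratch.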
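
A minimal sketch of such a refresh procedure follows. The procedure name `refresh_temp_tables` is made up, and the column lists assume the tables were created with the aliases used in the subqueries. The `aro_id` list is hard-coded here exactly as in the question; if it varies per user, it would have to be passed in or the filter applied at query time instead.

```sql
DELIMITER //

-- Hypothetical procedure: empties both tables and repopulates them
-- with the same SELECTs that were used to create them.
CREATE PROCEDURE refresh_temp_tables()
BEGIN
    TRUNCATE TABLE temp_uid;
    INSERT INTO temp_uid (id, parent_id, `date`, type)
    SELECT uids.id, uids_uids.parent_id, uids.created, uids.type
    FROM uids
    JOIN uids_uids ON uids_uids.uid_id = uids.id
    JOIN aros_uids ON uids.id = aros_uids.uid_id
    JOIN uids_uids ParentUids ON uids_uids.parent_id = ParentUids.uid_id
    WHERE uids.type IN ('Document','Photo','Release','PreRelease',
                        'ArtworkResource','Event')
      AND (uids.start_date IS NULL OR uids.start_date <= NOW())
      AND (uids.end_date IS NULL OR uids.end_date <= NOW())
      AND aros_uids.aro_id IN (3,2,86,1448);

    TRUNCATE TABLE temp_t2;
    INSERT INTO temp_t2 (parent_id, maxdate)
    SELECT uids_uids.parent_id, MAX(uids.created)
    FROM uids
    JOIN uids_uids ON uids_uids.uid_id = uids.id
    GROUP BY uids_uids.parent_id, uids.type;
END //

DELIMITER ;

-- Usage: rebuild both tables before running the final JOIN.
CALL refresh_temp_tables();
```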
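
For example, defining the tables explicitly might look like the sketch below. The column types are guesses (the real ones should be copied from `SHOW CREATE TABLE uids` and `SHOW CREATE TABLE uids_uids`), and the indexes are declared up front rather than added later.

```sql
-- Assumed column types; match them to the source tables before use.
-- Drop the CREATE...SELECT versions first if they already exist.
DROP TABLE IF EXISTS temp_uid;
CREATE TABLE temp_uid (
    id        INT UNSIGNED NOT NULL,
    parent_id INT UNSIGNED NOT NULL,
    `date`    DATETIME     NOT NULL,
    type      VARCHAR(32)  NOT NULL,
    INDEX parentDateIdx (parent_id, `date`)
);

DROP TABLE IF EXISTS temp_t2;
CREATE TABLE temp_t2 (
    parent_id INT UNSIGNED NOT NULL,
    maxdate   DATETIME     NOT NULL,
    INDEX parentMaxDateIdx (parent_id, maxdate)
);
```

Tables defined this way start out empty, so they are populated with `INSERT INTO ... SELECT` rather than `CREATE ... SELECT`.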

answered 2012-10-17T15:07:34.727