Here is my query:

SELECT SQL_BUFFER_RESULT SQL_BIG_RESULT users.id, users.email, 
        COUNT(av.user_id) AS article_views_count,
        COUNT(af.id) AS article_favorites_count,
        COUNT(lc.user_id) AS link_clicks_count,
        COUNT(ai.user_id) AS ad_impressions_count,
        COUNT(ac.user_id) AS ad_clicks_count
          FROM users
            LEFT JOIN article_views AS av     ON (av.user_id = users.id AND av.created_at >= '2012-11-28 00:00:00' AND av.created_at <= '2012-11-30 23:59:59')
            LEFT JOIN article_favorites AS af ON (af.user_id = users.id AND af.created_at >= '2012-11-28 00:00:00' AND af.created_at <= '2012-11-30 23:59:59')
            LEFT JOIN link_clicks AS lc       ON (lc.user_id = users.id AND lc.created_at >= '2012-11-28 00:00:00' AND lc.created_at <= '2012-11-30 23:59:59')
            LEFT JOIN ad_impressions AS ai    ON (ai.user_id = users.id AND ai.created_at >= '2012-11-28 00:00:00' AND ai.created_at <= '2012-11-30 23:59:59')
            LEFT JOIN ad_clicks AS ac         ON (ac.user_id = users.id AND ac.created_at >= '2012-11-28 00:00:00' AND ac.created_at <= '2012-11-30 23:59:59')
          GROUP BY users.id
          HAVING (article_views_count + article_favorites_count + link_clicks_count + ad_impressions_count + ad_clicks_count) > 0

Some stats to give you context:

  1. users: 1,474,348 rows
  2. article_views: 32,603,637 rows
  3. article_favorites: 10,199 rows
  4. link_clicks: 4,258,901 rows
  5. ad_impressions: 66,758,573 rows
  6. ad_clicks: 324,125 rows

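Each of the joined tables has a composite index on user_id and created_at (in that order).

We are running MySQL 5, and every table uses the MyISAM engine.

Here is the EXPLAIN output for the query: https://gist.github.com/4197482

The goal is to return only the users who had any activity (view, favorite, link click, ad impression, ad click) in that time period.

Any ideas on how to optimize this bad boy?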

2 Answers

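Your query looks like an analytical query: it runs an analysis over a large amount of data (it contains aggregate functions and a GROUP BY clause).

To improve the performance of this kind of query, you can materialize the result of the JOIN into a table, using: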

CREATE TABLE my_view AS SELECT ... FROM ... JOIN ...

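By doing so, subsequent queries will be more efficient because MySQL only has to compute the aggregates.

Then you just need to implement a strategy for refreshing the table (for example, based on timestamps).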
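
As a rough sketch of what such a refresh strategy could look like, here is a variation that stores per-day counts rather than the raw JOIN result. The table name user_activity_daily, its columns, and the column types are all assumptions for illustration; only article_views is shown, and the other four tables would follow the same pattern with their own source label.

```sql
-- Hypothetical summary table (names and types are assumptions):
-- one row per user, per day, per source table.
CREATE TABLE user_activity_daily (
  user_id       INT UNSIGNED NOT NULL,
  activity_date DATE         NOT NULL,
  source        VARCHAR(20)  NOT NULL,
  event_count   INT UNSIGNED NOT NULL,
  PRIMARY KEY (activity_date, source, user_id)
) ENGINE=MyISAM;

-- Refresh a single day. REPLACE overwrites any row that already
-- exists for the same (activity_date, source, user_id) key.
REPLACE INTO user_activity_daily (user_id, activity_date, source, event_count)
SELECT user_id, DATE(created_at), 'article_views', COUNT(*)
  FROM article_views
 WHERE created_at >= '2012-11-30 00:00:00'
   AND created_at <  '2012-12-01 00:00:00'
 GROUP BY user_id, DATE(created_at);

-- Report over a date range. Only users with at least one stored
-- event appear, so no HAVING filter is needed.
SELECT u.id, u.email,
       SUM(CASE WHEN d.source = 'article_views'
                THEN d.event_count ELSE 0 END) AS article_views_count
       -- the other four counts would be analogous SUM(CASE ...) columns
  FROM user_activity_daily AS d
  JOIN users AS u ON u.id = d.user_id
 WHERE d.activity_date BETWEEN '2012-11-28' AND '2012-11-30'
 GROUP BY u.id, u.email;
```

The trade-off is that the report is only as fresh as the last refresh, so the refresh job would need to run at least as often as you need up-to-date numbers.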

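Another solution is to import the data into a DBMS designed to handle this kind of query efficiently: a column-oriented database. For example, InfiniDB is an open-source DBMS based on MySQL, with a storage engine optimized for analytical queries.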

answered 2012-12-03T20:16:11.537

Try splitting the query into a separate INNER JOIN against each table and combining the results with UNION. For example:

SELECT users.id, users.email, COUNT(av.user_id) AS article_views_count
FROM users
JOIN article_views AS av ON (av.user_id = users.id AND av.created_at >= '2012-11-28 00:00:00' AND av.created_at <= '2012-11-30 23:59:59')
GROUP BY users.id, users.email

UNION

....
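
One possible way to flesh out this idea, offered as a sketch rather than the exact query intended above: aggregate each activity table on its own, stack the results with UNION ALL in a derived table, and sum them per user. The zero-filled placeholder columns and the alias t are assumptions added for illustration. Aggregating each table separately also means the counts are not multiplied by the row counts of the other tables, and users with no activity never appear, so the HAVING clause is no longer needed.

```sql
SELECT u.id, u.email,
       SUM(t.article_views)     AS article_views_count,
       SUM(t.article_favorites) AS article_favorites_count,
       SUM(t.link_clicks)       AS link_clicks_count,
       SUM(t.ad_impressions)    AS ad_impressions_count,
       SUM(t.ad_clicks)         AS ad_clicks_count
  FROM (
        -- One branch per activity table; each fills only its own column.
        SELECT user_id,
               COUNT(*) AS article_views,
               0 AS article_favorites,
               0 AS link_clicks,
               0 AS ad_impressions,
               0 AS ad_clicks
          FROM article_views
         WHERE created_at >= '2012-11-28 00:00:00' AND created_at <= '2012-11-30 23:59:59'
         GROUP BY user_id
        UNION ALL
        SELECT user_id, 0, COUNT(*), 0, 0, 0
          FROM article_favorites
         WHERE created_at >= '2012-11-28 00:00:00' AND created_at <= '2012-11-30 23:59:59'
         GROUP BY user_id
        UNION ALL
        SELECT user_id, 0, 0, COUNT(*), 0, 0
          FROM link_clicks
         WHERE created_at >= '2012-11-28 00:00:00' AND created_at <= '2012-11-30 23:59:59'
         GROUP BY user_id
        UNION ALL
        SELECT user_id, 0, 0, 0, COUNT(*), 0
          FROM ad_impressions
         WHERE created_at >= '2012-11-28 00:00:00' AND created_at <= '2012-11-30 23:59:59'
         GROUP BY user_id
        UNION ALL
        SELECT user_id, 0, 0, 0, 0, COUNT(*)
          FROM ad_clicks
         WHERE created_at >= '2012-11-28 00:00:00' AND created_at <= '2012-11-30 23:59:59'
         GROUP BY user_id
       ) AS t
  JOIN users AS u ON u.id = t.user_id
 GROUP BY u.id, u.email;
```

Whether this is actually faster depends on indexing: a range filter on created_at alone cannot seek on an index whose leading column is user_id, so it would be worth checking EXPLAIN for each branch.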
answered 2012-12-03T19:53:09.677