I have three tables with the following schemas:

Table: Apps

| ID (bigint) | USERID (Bigint)|      START_TIME (datetime) | 
-------------------------------------------------------------
|  1          |        13     |         2013-05-03 04:42:55 | 
|  2          |        13     |         2013-05-12 06:22:45 |
|  3          |        13     |         2013-06-12 08:44:24 |    
|  4          |        13     |         2013-06-24 04:20:56 |       
|  5          |        13     |         2013-06-26 08:20:26 |       
|  6          |        13     |         2013-09-12 05:48:27 | 

Table: Hosts

| ID (bigint) | APPID (Bigint)|         DEVICE_ID (Bigint)  | 
-------------------------------------------------------------
|  1          |        1      |                           1 | 
|  2          |        2      |                           1 |
|  3          |        1      |                           1 |    
|  4          |        3      |                           3 |       
|  5          |        1      |                           4 |      
|  6          |        2      |                           3 |

Table: Usage

| ID (bigint) | APPID (Bigint)|             HOSTID (Bigint) |   Factor (varchar)    |  
-------------------------------------------------------------------------------------
|  1          |        1      |                           1 |               Low     | 
|  2          |        1      |                           3 |               High    | 
|  3          |        2      |                           2 |               Low     | 
|  4          |        3      |                           4 |               Medium  | 
|  5          |        1      |                           5 |               Low     | 
|  6          |        2      |                           2 |               Medium  | 
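
For anyone who wants to reproduce the problem locally, the three tables above can be set up roughly as follows. This is a minimal sketch: the lowercase table names (`apps`, `hosts`, `usage`), the primary keys, the NOT NULL constraints and the VARCHAR length are assumptions, while the column types and the rows come from the tables above. `usage` is quoted with backticks because USAGE is a reserved word in MySQL.

```sql
-- Assumed DDL: keys, NOT NULL and VARCHAR(16) are illustrative choices.
CREATE TABLE apps (
  id         BIGINT PRIMARY KEY,
  userid     BIGINT NOT NULL,
  start_time DATETIME NOT NULL
);

CREATE TABLE hosts (
  id        BIGINT PRIMARY KEY,
  appid     BIGINT NOT NULL,
  device_id BIGINT NOT NULL
);

CREATE TABLE `usage` (
  id     BIGINT PRIMARY KEY,
  appid  BIGINT NOT NULL,
  hostid BIGINT NOT NULL,
  factor VARCHAR(16) NOT NULL
);

-- Sample rows copied from the tables above.
INSERT INTO apps (id, userid, start_time) VALUES
  (1, 13, '2013-05-03 04:42:55'),
  (2, 13, '2013-05-12 06:22:45'),
  (3, 13, '2013-06-12 08:44:24'),
  (4, 13, '2013-06-24 04:20:56'),
  (5, 13, '2013-06-26 08:20:26'),
  (6, 13, '2013-09-12 05:48:27');

INSERT INTO hosts (id, appid, device_id) VALUES
  (1, 1, 1), (2, 2, 1), (3, 1, 1),
  (4, 3, 3), (5, 1, 4), (6, 2, 3);

INSERT INTO `usage` (id, appid, hostid, factor) VALUES
  (1, 1, 1, 'Low'),    (2, 1, 3, 'High'),
  (3, 2, 2, 'Low'),    (4, 3, 4, 'Medium'),
  (5, 1, 5, 'Low'),    (6, 2, 2, 'Medium');
```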

Now, given a user ID as input, I want to get the count of table rows for each month (across all apps) and for each "Factor", over the last 6 months.

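If a DEVICE_ID appears more than once in a month (based on START_TIME, after joining Apps and Hosts), only the latest row of Usage (based on the combination of Apps, Hosts and Usage) should be considered when calculating the count.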

For the example above, the query output should be (for input user id=13):

| MONTH       | USAGE_COUNT   |               FACTOR        | 
-------------------------------------------------------------
|  5          |        0      |                 High        | 
|  6          |        0      |                 High        | 
|  7          |        0      |                 High        | 
|  8          |        0      |                 High        |       
|  9          |        0      |                 High        |       
|  10         |        0      |                 High        | 
|  5          |        2      |                 Low         | 
|  6          |        0      |                 Low         | 
|  7          |        0      |                 Low         | 
|  8          |        0      |                 Low         |       
|  9          |        0      |                 Low         |       
|  10         |        0      |                 Low         |
|  5          |        1      |                 Medium      | 
|  6          |        1      |                 Medium      | 
|  7          |        0      |                 Medium      | 
|  8          |        0      |                 Medium      |       
|  9          |        0      |                 Medium      |       
|  10         |        0      |                 Medium      |

How is this calculated?

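  1. For the month of May 2013 (05-2013), there are two apps in table Apps.
  2. In table Hosts, these apps are associated with device_ids 1, 1, 1, 4, 3.
  3. For this month (05-2013), for device_id=1 the latest value of start_time is 2013-05-12 06:22:45 (from tables Hosts and Apps), so look in table Usage for the combination appid=2 and hostid=2. There are two such rows, one with factor Low and the other with factor Medium.
  4. For this month (05-2013), for device_id=4, following the same procedure we get one entry, i.e., 0 Low.
  5. All other values are calculated in the same way.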
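
The steps above can be checked for May 2013 with a query along these lines. It is only a sketch for verifying the walk-through, not the final answer: the table names `apps`, `hosts` and `usage` and the hard-coded date range are assumptions made for illustration.

```sql
SELECT h.device_id,
       a.id AS appid,
       h.id AS hostid,
       u.factor
  FROM apps a
  JOIN hosts h ON h.appid = a.id
  JOIN (
        -- Latest start_time per device within May 2013 (steps 1 to 3).
        SELECT h2.device_id, MAX(a2.start_time) AS latest_start
          FROM apps a2
          JOIN hosts h2 ON h2.appid = a2.id
         WHERE a2.userid = 13
           AND a2.start_time >= '2013-05-01'
           AND a2.start_time <  '2013-06-01'
         GROUP BY h2.device_id
       ) latest ON latest.device_id   = h.device_id
              AND latest.latest_start = a.start_time
  -- LEFT JOIN keeps devices that have no matching Usage row (factor is NULL).
  LEFT JOIN `usage` u ON u.appid = a.id AND u.hostid = h.id
 WHERE a.userid = 13
 ORDER BY h.device_id, u.id;
```

With the sample data, this should list Low and Medium for device_id=1 and Low for device_id=4, which matches the expected counts for month 5; device_id=3 has no matching Usage row for its latest app in May, so it contributes nothing.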

To get the last 6 months in a query, I am trying the following:

SELECT MONTH(DATE_ADD(NOW(), INTERVAL aInt MONTH)) AS aMonth
    FROM
    (
        SELECT 0 AS aInt UNION SELECT -1 UNION SELECT -2 UNION SELECT -3 UNION SELECT -4 UNION SELECT -5
    ) AS LastSix
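
One thing to watch with the month list above is that a bare month number loses the year, so a window that crosses a year boundary becomes ambiguous. A possible variant, shown here only as a sketch (the aliases `offsets`, `shifted`, `aYear` and `d` are made up for illustration), returns the year alongside the month:

```sql
SELECT YEAR(shifted.d)  AS aYear,
       MONTH(shifted.d) AS aMonth
  FROM (
        -- Shift the current timestamp back by 0..5 months.
        SELECT DATE_ADD(NOW(), INTERVAL offsets.aInt MONTH) AS d
          FROM (
                SELECT 0 AS aInt UNION SELECT -1 UNION SELECT -2
                UNION SELECT -3 UNION SELECT -4 UNION SELECT -5
               ) AS offsets   -- MySQL requires an alias on every derived table
       ) AS shifted
 ORDER BY aYear, aMonth;
```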

Please check the sqlfiddle: http://sqlfiddle.com/#!2/55fc2

1 Answer

Because the calculation you're doing involves the same joins several times, I started by creating a view.

CREATE VIEW `app_host_usage`
AS 
SELECT a.id "appid", h.id "hostid", u.id "usageid",
       a.userid, a.start_time, h.device_id, u.factor
  FROM apps a
  LEFT OUTER JOIN hosts h ON h.appid = a.id
  LEFT OUTER JOIN `usage` u ON u.appid = a.id AND u.hostid = h.id
  WHERE a.start_time > DATE_ADD(NOW(), INTERVAL -7 MONTH)

The WHERE condition is there because I assume you don't want, say, July 2005 and July 2006 grouped together in the same count.
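
To illustrate the concern: grouping by MONTH(start_time) alone would merge the same calendar month from different years. If the time-window filter were ever dropped, one alternative (a sketch only, not part of the view; the aliases `yr`, `mon` and `app_rows` are made up) would be to group by year and month together:

```sql
SELECT YEAR(a.start_time)  AS yr,
       MONTH(a.start_time) AS mon,
       COUNT(*)            AS app_rows
  FROM apps a
 WHERE a.userid = 13
 -- Year and month together keep July 2005 and July 2006 in separate groups.
 GROUP BY YEAR(a.start_time), MONTH(a.start_time)
 ORDER BY yr, mon;
```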

With that view in place, the query becomes

SELECT months.Month, COUNT(DISTINCT device_id), factors.factor
FROM
  (
    -- Get the last six months
    SELECT (MONTH(NOW()) + aInt + 11) % 12 + 1 "Month" FROM
      (SELECT 0 AS aInt UNION SELECT -1 UNION SELECT -2 UNION SELECT -3 UNION SELECT -4 UNION SELECT -5) LastSix
  ) months
  JOIN
  ( 
    -- Get all known factors
    SELECT DISTINCT factor FROM `usage` 
  ) factors
  LEFT OUTER JOIN
  (
    -- Get factors for each device... 
    SELECT 
           MONTH(start_time) "Month", 
           device_id,
           factor
      FROM app_host_usage a
      WHERE userid=13 
        AND start_time IN (
          -- ...where the corresponding usage row is connected
          --    to an app row with the highest start time of the
          --    month for that device.
          SELECT MAX(start_time)
            FROM app_host_usage a2
            WHERE a2.device_id = a.device_id
            GROUP BY MONTH(start_time)
        )
     GROUP BY MONTH(start_time), device_id, factor

  ) usageids ON usageids.Month = months.Month 
            AND usageids.factor = factors.factor
GROUP BY factors.factor, months.Month
ORDER BY factors.factor, months.Month

It's pretty complicated, but I've tried to add comments explaining what each part does. See this sqlfiddle: http://sqlfiddle.com/#!2/5c871/1/0

Answered 2013-10-06T22:28:25.957