I have the following simple schema:

CREATE TABLE POSTS  (
    ID INT NOT NULL,
    DATE      DATE NOT NULL,
    [Other stuff omitted]
);

CREATE TABLE TOPICS (
    ID INT NOT NULL,
    [Other stuff omitted]
);

CREATE TABLE THETA (
    POST_ID INT NOT NULL,
    TOPIC_ID INT NOT NULL,
    WEIGHT FLOAT NOT NULL
);

I have a query that sums the WEIGHT values in THETA across all posts, grouped by month and topic ID:

SELECT THETA.TOPIC_ID as TopicID, POSTS.DATE as Date, SUM(THETA.WEIGHT) as Value
  FROM POSTS INNER JOIN THETA
  WHERE THETA.POST_ID=POSTS.ID
  GROUP BY YEAR(POSTS.DATE), MONTH(POSTS.DATE), TopicID;

This works as expected, with results like the following:

+---------+------------+---------------------+
| TopicID | Date       | Value               |
+---------+------------+---------------------+
|       0 | 2008-08-19 |   350.4930010139942 |
|       0 | 2008-09-18 |  1745.5010008439422 |
|       0 | 2008-10-03 |   1468.824001269415 |
|       0 | 2008-11-25 |   1079.579000659287 |
|       0 | 2008-12-11 |  1070.3860008455813 |
|       0 | 2009-01-24 |  1453.3730010837317 |
|       0 | 2009-02-20 |  1139.2920009773225 |
|       1 | 2008-08-19 |  288.09700035490096 |
|       1 | 2008-09-22 |  1307.5790000930429 |
|       1 | 2008-10-16 |  1050.1739999558777 |
|       1 | 2008-11-11 |   868.2280002105981 |
|       1 | 2008-12-18 |   897.6830000579357 |
|       1 | 2009-01-12 |  1148.5619999151677 |
|       1 | 2009-02-12 |   858.0710002686828 |
|       2 | 2008-08-19 |  415.83300026878715 |
...

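However, I would like to normalize Value by the number of posts in that month. For example, if there were 100 posts in the month of 2008-08-19, the value in the first result row would be 3.50493 and the value in the eighth result row would be 2.88097. The challenge is that the number of posts differs from month to month, so I'm not sure how to do this. Any ideas?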

2 Answers

Perhaps:

SELECT t.TOPIC_ID as TopicID, p.DATE as Date, SUM(t.WEIGHT)/s.Month_CT as Value
FROM POSTS p
JOIN THETA t
  ON t.POST_ID = p.ID
JOIN (SELECT YEAR(DATE) as Yr, MONTH(DATE) as Mnth, COUNT(ID) as Month_CT
        FROM POSTS
        GROUP BY YEAR(DATE), MONTH(DATE)
       )s
  ON    YEAR(p.DATE) = s.Yr
    AND MONTH(p.DATE) = s.Mnth
GROUP BY YEAR(p.DATE), MONTH(p.DATE), TopicID;
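
One way to sanity-check the query above is to run the same join against a tiny data set where the result can be computed by hand. The sketch below assumes MySQL; the `demo_posts` and `demo_theta` tables and every value in them are made up for illustration and are not part of the original schema. It also adds `s.Month_CT` to the GROUP BY, which should keep the query valid when `ONLY_FULL_GROUP_BY` is enabled.

```sql
-- Hypothetical demo tables (names and data invented for this check).
CREATE TABLE demo_posts (ID INT NOT NULL, DATE DATE NOT NULL);
CREATE TABLE demo_theta (POST_ID INT NOT NULL, TOPIC_ID INT NOT NULL, WEIGHT FLOAT NOT NULL);

-- Two posts in August 2008, one post in September 2008.
INSERT INTO demo_posts VALUES (1, '2008-08-19'), (2, '2008-08-25'), (3, '2008-09-18');
INSERT INTO demo_theta VALUES
  (1, 0, 0.5), (1, 1, 0.5),
  (2, 0, 1.5), (2, 1, 0.5),
  (3, 0, 1.0);

-- Same shape as the query above: weights summed per month and topic,
-- divided by the number of posts in that month.
SELECT t.TOPIC_ID AS TopicID,
       s.Yr,
       s.Mnth,
       SUM(t.WEIGHT) / s.Month_CT AS Value
FROM demo_posts p
JOIN demo_theta t
  ON t.POST_ID = p.ID
JOIN (SELECT YEAR(DATE) AS Yr, MONTH(DATE) AS Mnth, COUNT(ID) AS Month_CT
        FROM demo_posts
        GROUP BY YEAR(DATE), MONTH(DATE)
       ) s
  ON    YEAR(p.DATE) = s.Yr
    AND MONTH(p.DATE) = s.Mnth
GROUP BY s.Yr, s.Mnth, s.Month_CT, t.TOPIC_ID;

-- Clean up the demo tables.
DROP TABLE demo_theta;
DROP TABLE demo_posts;
```

By hand: August 2008 has 2 posts, so topic 0 gives (0.5 + 1.5) / 2 = 1.0 and topic 1 gives (0.5 + 0.5) / 2 = 0.5; September 2008 has 1 post, so topic 0 gives 1.0 / 1 = 1.0.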
answered 2013-09-25T19:39:20.833

SELECT  t.topic_id TopicID,
        CONCAT(y, '-', m, '-01') AS Date,
        SUM(t.weight) / cnt AS NormalizedValue
FROM    (
        -- one row per calendar month, with that month's post count
        SELECT  YEAR(date) y,
                MONTH(date) m,
                COUNT(*) AS cnt
        FROM    posts
        GROUP BY
                y, m
        ) pm
JOIN    posts p
-- range condition on p.date instead of YEAR()/MONTH() on the column
ON      p.date >= '0000-01-01' + INTERVAL y YEAR + INTERVAL m - 1 MONTH
        AND p.date < '0000-01-01' + INTERVAL y YEAR + INTERVAL m MONTH
JOIN    theta t
ON      t.post_id = p.id
GROUP BY
        y, m, cnt, t.topic_id
answered 2013-09-25T20:01:05.000