mysql - Count the number of rows in 30 day bins

Question

Each row in my table has a datetime stamp. Starting from now, I want to count how many rows fall in the last 30 days, the 30 days before that, and so on, until the 30-day bins reach back to the start of the table.

I have successfully carried out this query by using Python and making several queries. But I'm almost certain that it can be done in one single MySQL query.

score 3 · Accepted Answer

If you only need to count the intervals that contain at least one row, you can use this:

select
  datediff(curdate(), `date`) div 30 as block,
  count(*) as rows_per_block
from
  your_table
group by
  block

This one also shows the start and end dates of each block:

select
  datediff(curdate(), `date`) div 30 as block,
  date_sub(curdate(),
           INTERVAL (1+datediff(curdate(), `date`) div 30)*30-1 DAY) as start_block,
  date_sub(curdate(),
           INTERVAL (datediff(curdate(), `date`) div 30)*30 DAY) as end_block,
  count(*) as rows_per_block
from your_table
group by block

But if you also need to show every interval, including the empty ones, you can use a solution like this:

select
  num,
  date_sub(curdate(),
           INTERVAL (num+1)*30-1 DAY) as start_block,
  date_sub(curdate(),
           INTERVAL num*30 DAY) as end_block,
  count(`date`)
from
  numbers left join your_table
  on `date` between date_sub(curdate(),
           INTERVAL (num+1)*30-1 DAY)  and
  date_sub(curdate(),
           INTERVAL num*30 DAY)
where num<=(datediff(curdate(), (select min(`date`) from your_table) ) div 30)
group by num

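This requires that you already have a numbers table prepared. Alternatively, see the linked fiddle for a solution that works without a numbers table.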

score 3 · Accepted Answer

No stored procedures, no temporary tables, just one query, and an efficient execution plan given an index on the date column:

select

  subdate(
    '2012-12-31',
    floor(dateDiff('2012-12-31', dateStampColumn) / 30) * 30 + 30 - 1
  ) as "period starting",

  subdate(
    '2012-12-31',
    floor(dateDiff('2012-12-31', dateStampColumn) / 30) * 30
  ) as "period ending",

  count(*)

from
  YOURTABLE
group by floor(dateDiff('2012-12-31', dateStampColumn) / 30);

Apart from this incantation, it should be fairly clear what is going on here:

floor(dateDiff('2012-12-31', dateStampColumn) / 30)

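That expression appears several times, and it evaluates to the number of 30-day periods ago that dateStampColumn falls. dateDiff returns the difference in days, dividing by 30 expresses it in 30-day periods, and floor() rounds the result down to an integer. Once we have this number, we can GROUP BY it, and then do a bit of arithmetic to turn the number back into the start and end dates of the period.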
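To make the arithmetic concrete, here is a minimal Python sketch of the same calculation, done outside the database. The names `block_index` and `period_bounds` are made up for illustration and are not part of the query; the sketch assumes date stamps on or before the reference date.

```python
from datetime import date, timedelta

PERIOD_DAYS = 30  # bin width used throughout the answer


def block_index(reference: date, stamp: date) -> int:
    """Mirror floor(dateDiff(reference, stamp) / 30)."""
    return (reference - stamp).days // PERIOD_DAYS


def period_bounds(reference: date, block: int) -> tuple[date, date]:
    """Return the inclusive (start, end) dates of a block, as the two subdate() calls do."""
    end = reference - timedelta(days=block * PERIOD_DAYS)
    start = reference - timedelta(days=block * PERIOD_DAYS + PERIOD_DAYS - 1)
    return start, end


reference = date(2012, 12, 31)
stamp = date(2012, 11, 1)
block = block_index(reference, stamp)
print(block, period_bounds(reference, block))
```

A row stamped 2012-11-01 is 60 days before the reference date, so it lands in block 2, whose inclusive bounds are 2012-10-03 and 2012-11-01.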

Replace '2012-12-31' with now() if you prefer. Here is some sample data:

CREATE TABLE YOURTABLE
    (`Id` int, `dateStampColumn` datetime);

INSERT INTO YOURTABLE
    (`Id`, `dateStampColumn`)
VALUES
    (1, '2012-10-15 02:00:00'),
    (1, '2012-10-17 02:00:00'),
    (1, '2012-10-30 02:00:00'),
    (1, '2012-10-31 02:00:00'),
    (1, '2012-11-01 02:00:00'),
    (1, '2012-11-02 02:00:00'),
    (1, '2012-11-18 02:00:00'),
    (1, '2012-11-19 02:00:00'),
    (1, '2012-11-21 02:00:00'),
    (1, '2012-11-25 02:00:00'),
    (1, '2012-11-25 02:00:00'),
    (1, '2012-11-26 02:00:00'),
    (1, '2012-11-26 02:00:00'),
    (1, '2012-11-24 02:00:00'),
    (1, '2012-11-23 02:00:00'),
    (1, '2012-11-28 02:00:00'),
    (1, '2012-11-29 02:00:00'),
    (1, '2012-11-30 02:00:00'),
    (1, '2012-12-01 02:00:00'),
    (1, '2012-12-02 02:00:00'),
    (1, '2012-12-15 02:00:00'),
    (1, '2012-12-17 02:00:00'),
    (1, '2012-12-18 02:00:00'),
    (1, '2012-12-19 02:00:00'),
    (1, '2012-12-21 02:00:00'),
    (1, '2012-12-25 02:00:00'),
    (1, '2012-12-25 02:00:00'),
    (1, '2012-12-26 02:00:00'),
    (1, '2012-12-26 02:00:00'),
    (1, '2012-12-24 02:00:00'),
    (1, '2012-12-23 02:00:00'),
    (1, '2012-12-31 02:00:00'),
    (1, '2012-12-30 02:00:00'),
    (1, '2012-12-28 02:00:00'),
    (1, '2012-12-28 02:00:00'),
    (1, '2012-12-30 02:00:00');

Results:

period starting     period ending   count(*)
2012-12-02          2012-12-31      17
2012-11-02          2012-12-01      14
2012-10-03          2012-11-01      5

The period endpoints are inclusive.

Play with this in SQL Fiddle.

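One potentially awkward point is that any 30-day period with zero matching rows will not appear in the result. If you could join this against a table of periods, that could be eliminated. However, MySQL has nothing like PostgreSQL's generate_series(), so you would have to deal with it in your application or try this clever hack.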
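If you go the application route, filling in the missing periods is only a few lines. The following is a rough Python sketch, assuming the query results have already been read into a dict that maps block number to row count and that the oldest date stamp is known; `fill_empty_periods` and the sample counts are hypothetical.

```python
from datetime import date, timedelta


def fill_empty_periods(counts_by_block, reference, oldest, period_days=30):
    """Return (start, end, count) for every block from 0 back to the oldest row."""
    last_block = (reference - oldest).days // period_days
    filled = []
    for block in range(last_block + 1):
        end = reference - timedelta(days=block * period_days)
        start = end - timedelta(days=period_days - 1)
        # Blocks the query did not return had no rows, so they count as 0.
        filled.append((start, end, counts_by_block.get(block, 0)))
    return filled


# Hypothetical query output: block 1 had no rows, so the query omitted it.
counts = {0: 17, 2: 5}
periods = fill_empty_periods(counts, date(2012, 12, 31), date(2012, 10, 15))
for start, end, count in periods:
    print(start, end, count)
```

The block numbers here follow the same floor(dateDiff / 30) convention as the query, so the dict can be built directly from a result set that selects the block number alongside the count.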

score 2 · Accepted Answer

Try this:

SELECT 
  DATE_FORMAT(t1.`Date`, '%Y-%m-%d'),
  COUNT(t2.Id)
FROM 
(
  SELECT SUBDATE(CURDATE(), ID) `Date`
  FROM
  (
    SELECT  t2.digit * 10 + t1.digit + 1 AS id
    FROM         TEMP AS t1
    CROSS JOIN TEMP AS t2
  ) t 
  WHERE Id <= 30 
) t1
LEFT JOIN YOURTABLE t2 ON DATE(t1.`Date`) = DATE(t2.dateStampColumn)
GROUP BY t1.`Date`;

SQL Fiddle demo

However, you need to create a helper table TEMP like this:

CREATE TABLE TEMP 
(Digit int);
INSERT INTO TEMP VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
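For anyone unsure what the cross join of the digits table produces, here is a small Python sketch of the same idea. The fixed value of `today` is an assumption standing in for CURDATE(), chosen only so the output is reproducible.

```python
from datetime import date, timedelta
from itertools import product

digits = range(10)  # stands in for the ten rows of the TEMP table

# CROSS JOIN of TEMP with itself: t2.digit * 10 + t1.digit + 1 yields 1..100.
ids = sorted(tens * 10 + ones + 1 for tens, ones in product(digits, digits))

# WHERE Id <= 30 keeps 30 offsets; SUBDATE(CURDATE(), ID) turns each into a day.
today = date(2013, 1, 5)  # illustrative assumption, not taken from the answer
days = [today - timedelta(days=i) for i in ids if i <= 30]
print(ids[0], ids[-1], len(days), days[0], days[-1])
```

Because the offsets start at 1, the generated days run from yesterday back to 30 days ago, so today itself is not among them.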

score 0 · Accepted Answer

Could you please try the following:

SELECT Count(*)
FROM
  yourtable
where
  dateColumn between Now() - Interval 30 Day and Now()

A better answer needs some looping to isolate all the 30-day intervals going back, since you also need a 30-day interval that covers the span between min(Date) in the table and the last loop date. Failing that, you need at least another table that holds the dates of each 30-day interval, which you can then join against.
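The looping itself could look something like the Python sketch below, which only builds the interval boundaries and does not touch the database. `thirty_day_intervals` is a hypothetical helper, and the two dates are example values rather than anything read from a table.

```python
from datetime import date, timedelta


def thirty_day_intervals(today, min_date, period_days=30):
    """Walk back from today in 30-day steps until min_date is covered."""
    intervals = []
    end = today
    while end >= min_date:
        start = end - timedelta(days=period_days - 1)
        intervals.append((start, end))
        # The next interval ends the day before this one starts, so none overlap.
        end = start - timedelta(days=1)
    return intervals


intervals = thirty_day_intervals(date(2012, 12, 30), date(2012, 8, 5))
for start, end in intervals:
    print(start, end)
```

Each pair is an inclusive 30-day range, and the last range is the one that contains the minimum date. The pairs could be inserted into a table of intervals and joined against, or used to issue one count query per interval.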

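Here is a query that counts rows per calendar month instead, which is not exactly what you need. It groups on the month number alone, so the same month from different years is counted together.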

SELECT
  extract(month from datecolumn),
  count(*)
FROM
  yourtable
GROUP BY
  extract(month from datecolumn);

Having given some thought to my latter comment and to Stefan's comment, here is a longer query that gives proper results. It is based on my own sample data and relies on MySQL's INTERVAL syntax. If you need to use it with SQL Server, use DATEADD or an equivalent function.

SQLFIDDLE

Sample data:

ID_MAIN  FIELD1  FILTER
----------------------------------------
1        red     August, 05 2012 00:00:00+0000
2        blue    September, 15 2012 00:00:00+0000
3        pink    September, 20 2012 00:00:00+0000
4        blue    September, 27 2012 00:00:00+0000
5        blue    October, 02 2012 00:00:00+0000
6        blue    October, 16 2012 00:00:00+0000
7        blue    October, 22 2012 00:00:00+0000
8        pink    November, 12 2012 00:00:00+0000
9        pink    November, 28 2012 00:00:00+0000
10       pink    December, 01 2012 00:00:00+0000
11       pink    December, 08 2012 00:00:00+0000
12       pink    December, 22 2012 00:00:00+0000

Query:

set @i:= 0;
SELECT MIN(filter) INTO @mindt
FROM MAIN
;
select
  count(a.id_main),
  y.dateInterval,
  (y.dateInterval - interval 29 day) as lowerBound
from
  main a join (
    SELECT date_format(Now(),'%Y-%m-%d') as dateInterval
    from dual
    union all
    select x.dateInterval
    from (
      SELECT
        date_format(
          DATE(DATE_ADD(Now(),
                        INTERVAL @i:=@i-29 DAY)),'%Y-%m-%d') AS dateInterval
      FROM Main, (SELECT @i:=0) r
      HAVING datediff(dateInterval,@mindt) >= 30
      order by dateInterval desc) as x) as y
  on a.filter <= y.dateInterval 
     and a.filter > (y.dateInterval - interval 29 day)
group by y.dateInterval
order by y.dateInterval desc
;

Results:

COUNT(A.ID_MAIN)    DATEINTERVAL    LOWERBOUND
----------------------------------------------
2                   2012-12-30      2012-12-01
3                   2012-12-01      2012-11-02
2                   2012-11-02      2012-10-04
4                   2012-10-04      2012-09-05

score 0 · Accepted Answer

Create a stored procedure that counts the rows in 30-day blocks.

First run the script below to create the procedure, then call the procedure whenever you want to generate the data.

DELIMITER $$

DROP PROCEDURE IF EXISTS `sp_CountDataByDays`$$

CREATE DEFINER=`root`@`localhost` PROCEDURE `sp_CountDataByDays`()
BEGIN 
    CREATE TEMPORARY TABLE daterange (
            id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, 
            fromDate DATE, 
            toDate DATE, 
            PRIMARY KEY (`id`)
    ); 

    SELECT DATEDIFF(CURRENT_DATE(), dteCol) INTO @noOfDays 
    FROM yourTable ORDER BY dteCol LIMIT 1;

    SET @counter = -1;
    WHILE (@noOfDays > @counter) DO 
        INSERT daterange (toDate, fromDate) 
        VALUES (DATE_SUB(CURRENT_DATE(), INTERVAL @counter DAY), DATE_SUB(CURRENT_DATE(), INTERVAL @counter:=@counter + 30 DAY));
    END WHILE;

    SELECT d.id, d.fromdate, d.todate, COUNT(d.id) rowcnt 
    FROM daterange d  
    INNER JOIN yourTable a ON a.dteCol BETWEEN d.fromdate AND d.todate 
    GROUP BY d.id;

    DROP TABLE daterange;
END$$

DELIMITER ;

Then call the procedure:

CALL sp_CountDataByDays();

You get output like this:

ID  From Date   To Date     Row Count
1   2012-12-06  2013-01-05  17668
2   2012-11-06  2012-12-06  2845
3   2012-10-07  2012-11-06  2276
4   2012-09-07  2012-10-07  4561
5   2012-08-08  2012-09-07  5415
6   2012-07-09  2012-08-08  8954
7   2012-06-09  2012-07-09  4387
8   2012-05-10  2012-06-09  7911
9   2012-04-10  2012-05-10  7935
10  2012-03-11  2012-04-10  2566
