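I'm assuming that a user may have joined a group long before the billing period, and may not change status at all during the billing period. This requires your entire table to be scanned to build a membership table, which looks like this: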
create table membership (
    UserId int not null,
    GroupId int not null,
    start datetime not null,
    end datetime not null,
    count int not null,
    primary key (UserId, GroupId, end )
);
Once this is correctly populated, the answer you want is easy to obtain:
set @sm = '2009-02-01';
set @em = date_sub( date_add( @sm, interval 1 month), interval 1 day);
# sum( datediff( e, s ) + 1 ) -- +1 needed to include last day in billing
select UserId,
       GroupId,
       sum(datediff( if(end > @em, @em, end),
                     if(start<@sm, @sm, start) ) + 1 ) as n
from membership
where start <= @em and end >= @sm
group by UserId, GroupId
having n >= 15;
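To make the clipping arithmetic in that query explicit, here is a minimal Python sketch of the same calculation for a single membership row. The names `month_bounds` and `billable_days` are my own illustration and not part of the MySQL solution; the sketch assumes both ends of a membership are inclusive, which is what the `+ 1` in the query implies.

```python
import calendar
from datetime import date


def month_bounds(first_day):
    """Return (first_day, last_day) of the month, like @sm and @em."""
    last = calendar.monthrange(first_day.year, first_day.month)[1]
    return first_day, first_day.replace(day=last)


def billable_days(start, end, sm, em):
    """Days of [start, end] that fall inside [sm, em], both inclusive."""
    if start > em or end < sm:       # same filter as the WHERE clause
        return 0
    clipped_start = max(start, sm)   # if(start < @sm, @sm, start)
    clipped_end = min(end, em)       # if(end > @em, @em, end)
    return (clipped_end - clipped_start).days + 1


sm, em = month_bounds(date(2009, 2, 1))
# A membership running from 20 Feb to 30 May is clipped to 20..28 Feb.
print(billable_days(date(2009, 2, 20), date(2009, 5, 30), sm, em))  # prints 9
```

Summing this value over all rows of one (UserId, GroupId) pair gives the `n` column of the query.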
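The scan needs to be performed by a cursor (which won't be fast). We need to sort your input table by ActionDate and Action, so that "join" events appear before "leave" events. The count field is there to help cope with pathological cases - where a membership ends on a date, is restarted on the same date, ends again on the same date, starts again on the same date, and so on. In these cases, we increment the count for each start event and decrement it for each end event. We only close a membership when an end event takes the count down to zero. Once the membership table has been populated, you can check the value of count: closed memberships should have count = 0, and open memberships (not yet closed) should have count = 1.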
The cursor query is:
select UserID as _UserID, GroupID as _GroupID, Date(ActionDate) adate, Action from tbl
order by UserId, GroupId, Date(ActionDate), Action desc;
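The "Action desc" should break ties so that start events appear before end events, should someone join and leave a group on the same date. ActionDate needs to be converted from a datetime to a date because we're interested in units of days.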
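The tie-break can be checked outside the database. The sketch below is plain Python with made-up tuples in the shape `(UserId, GroupId, ActionDate, Action)`; it sorts on the same keys as the cursor query. Negating Action stands in for `Action desc`, assuming joins are coded 1 and leaves -1 as in the rest of this answer. The function name `cursor_order` is hypothetical.

```python
from datetime import datetime

# Hypothetical events: (UserId, GroupId, ActionDate, Action)
events = [
    (1, 10, datetime(2009, 2, 5, 9, 0), -1),   # leave in the morning
    (1, 10, datetime(2009, 2, 5, 17, 30), 1),  # re-join the same day
    (1, 10, datetime(2009, 2, 3, 12, 0), 1),
]


def cursor_order(event):
    user_id, group_id, action_date, action = event
    # Truncate to the day, then put joins (1) ahead of leaves (-1).
    return (user_id, group_id, action_date.date(), -action)


ordered = sorted(events, key=cursor_order)
for user_id, group_id, action_date, action in ordered:
    print(user_id, group_id, action_date.date(), action)
```

On 5 Feb the join is listed before the leave even though it happened later in the day, so at day granularity the membership that started on 3 Feb is never closed.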
The actions within the cursor would be the following:
if Action = 1 then
    insert into membership
        set start=adate, end='2037-12-31', UserId=_UserId, GroupId=_GroupId, count=1
        on duplicate key update count = count + 1;
elseif Action = -1 then
    update membership
        set end= if( count=1, adate, end),
            count = count - 1
        where UserId=_UserId and GroupId=_GroupId and end = '2037-12-31';
end if;
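I have not given you the exact syntax of the cursor definition needed (you can find that in the MySQL manual) because the full code would obscure the idea. In fact, it might be faster to perform the cursor logic within your application - maybe even to build the membership details within the application.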
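As a rough illustration of doing the cursor logic in the application instead, here is a Python sketch that applies the same counting rule to an in-memory list of events. Everything here (the function name `build_membership`, the dict layout, the `OPEN_END` sentinel) is my own naming; only the rules come from the description above, and I am assuming that a leave with no open membership is simply ignored, which is what the SQL update would do.

```python
from datetime import date

OPEN_END = date(2037, 12, 31)  # sentinel end date for open memberships


def build_membership(events):
    """events: iterable of (user_id, group_id, action_date, action),
    where action is 1 for join, -1 for leave, and action_date is a date."""
    closed = []   # finished memberships
    open_ = {}    # (user_id, group_id) -> row still ending at OPEN_END
    # Same order as the cursor: joins before leaves on the same day.
    for user_id, group_id, adate, action in sorted(
        events, key=lambda e: (e[0], e[1], e[2], -e[3])
    ):
        key = (user_id, group_id)
        row = open_.get(key)
        if action == 1:
            if row is None:
                open_[key] = {"user": user_id, "group": group_id,
                              "start": adate, "end": OPEN_END, "count": 1}
            else:
                row["count"] += 1          # the "on duplicate key" case
        elif row is not None:              # a leave with nothing open is ignored
            if row["count"] == 1:
                row["end"] = adate         # last leave closes the membership
                closed.append(open_.pop(key))
            row["count"] -= 1
    return closed + list(open_.values())


events = [
    (1, 10, date(2009, 2, 3), 1),
    (1, 10, date(2009, 2, 5), -1),
    (1, 10, date(2009, 2, 5), 1),
    (1, 10, date(2009, 2, 5), -1),
    (3, 10, date(2009, 1, 1), 1),
]
for row in build_membership(events):
    print(row)
```

With these events the result is one closed membership for user 1 (3 Feb to 5 Feb, count 0) and one open membership for user 3 (count 1).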
EDIT: Here is the actual code:
create table tbl (
    UserId int not null,
    GroupId int not null,
    Action int not null,
    ActionDate datetime not null
);
create table membership (
    UserId int not null,
    GroupId int not null,
    start datetime not null,
    end datetime not null,
    count int not null,
    primary key (UserId, GroupId, end )
);
drop procedure if exists popbill;
delimiter //
CREATE PROCEDURE popbill()
BEGIN
    DECLARE done INT DEFAULT 0;
    DECLARE _UserId, _GroupId, _Action int;
    DECLARE _adate date;
    -- Joins (Action = 1) sort ahead of leaves (Action = -1) within a day.
    DECLARE cur1 CURSOR FOR
        select UserID, GroupID, Date(ActionDate) adate, Action
        from tbl order by UserId, GroupId, Date(ActionDate), Action desc;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
    truncate table membership;
    OPEN cur1;
    REPEAT
        FETCH cur1 INTO _UserId, _GroupId, _adate, _Action;
        IF NOT done THEN
            IF _Action = 1 THEN
                -- Open a membership, or bump the count if one is already open.
                INSERT INTO membership
                    set start=_adate, end='2037-12-31',
                        UserId=_UserId, GroupId=_GroupId, count=1
                    on duplicate key update count = count + 1;
            ELSE
                -- Close the membership only when the count drops from 1 to 0.
                update membership
                    set end= if( count=1, _adate, end),
                        count = count - 1
                    where UserId=_UserId and GroupId=_GroupId and end = '2037-12-31';
            END IF;
        END IF;
    UNTIL done END REPEAT;
    CLOSE cur1;
END
//
delimiter ;
Here is some test data:
insert into tbl values (1, 10, 1, '2009-01-01' );
insert into tbl values (1, 10, -1, '2009-01-02' );
insert into tbl values (1, 10, 1, '2009-02-03' );
insert into tbl values (1, 10, -1, '2009-02-05' );
insert into tbl values (1, 10, 1, '2009-02-05' );
insert into tbl values (1, 10, -1, '2009-02-05' );
insert into tbl values (1, 10, 1, '2009-02-06' );
insert into tbl values (1, 10, -1, '2009-02-06' );
insert into tbl values (2, 10, 1, '2009-02-20' );
insert into tbl values (2, 10, -1, '2009-05-30');
insert into tbl values (3, 10, 1, '2009-01-01' );
insert into tbl values (4, 10, 1, '2009-01-31' );
insert into tbl values (4, 10, -1, '2009-05-31' );
Here is the code being run, and the results:
call popbill;
select * from membership;
+--------+---------+---------------------+---------------------+-------+
| UserId | GroupId | start               | end                 | count |
+--------+---------+---------------------+---------------------+-------+
|      1 |      10 | 2009-01-01 00:00:00 | 2009-01-02 00:00:00 |     0 |
|      1 |      10 | 2009-02-03 00:00:00 | 2009-02-05 00:00:00 |     0 |
|      1 |      10 | 2009-02-06 00:00:00 | 2009-02-06 00:00:00 |     0 |
|      2 |      10 | 2009-02-20 00:00:00 | 2009-05-30 00:00:00 |     0 |
|      3 |      10 | 2009-01-01 00:00:00 | 2037-12-31 00:00:00 |     1 |
|      4 |      10 | 2009-01-31 00:00:00 | 2009-05-31 00:00:00 |     0 |
+--------+---------+---------------------+---------------------+-------+
6 rows in set (0.00 sec)
Then, check how many billable days come up in February 2009:
set @sm = '2009-02-01';
set @em = date_sub( date_add( @sm, interval 1 month), interval 1 day);
select UserId,
       GroupId,
       sum(datediff( if(end > @em, @em, end),
                     if(start<@sm, @sm, start) ) + 1 ) as n
from membership
where start <= @em and end >= @sm
group by UserId, GroupId;
+--------+---------+------+
| UserId | GroupId | n    |
+--------+---------+------+
|      1 |      10 |    4 |
|      2 |      10 |    9 |
|      3 |      10 |   28 |
|      4 |      10 |   28 |
+--------+---------+------+
4 rows in set (0.00 sec)
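This can be adapted to scan the table only for changes since the last run:
- Remove the "truncate table membership" statement.
- Create a control table containing the last timestamp processed.
- Calculate the last timestamp you want to include in this run. (I would suggest that max(ActionDate) is not a good choice, because some rows may arrive out of order with earlier timestamps. A good choice is "00:00:00" this morning, or "00:00:00" on the first day of the month.)
- Alter the cursor query to include only the tbl entries between the date of the last run (from the control table) and the calculated last date.
- Finally, update the control table with the calculated last date.
If you do that, it is also a good idea to pass a flag that allows you to rebuild from scratch - i.e. reset the control table to the start of time and truncate the membership table before running the usual procedure.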
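The window calculation for such an incremental run can be sketched in Python. This is only an outline under my own assumptions: `last_processed` stands in for the value held in the control table, `BEGINNING_OF_TIME` is an arbitrary "start of time", the cutoff is 00:00:00 today, and the window is half-open (`>= lower`, `< upper`) so that no event is picked up twice. The function names are hypothetical.

```python
from datetime import datetime

BEGINNING_OF_TIME = datetime(1970, 1, 1)  # assumed "start of time" value


def run_window(last_processed, now, rebuild=False):
    """Return (lower, upper) bounds for this run.

    upper is 00:00:00 today rather than max(ActionDate), which leaves
    today's rows (possibly still arriving out of order) for the next run.
    """
    lower = BEGINNING_OF_TIME if rebuild else last_processed
    upper = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return lower, upper


def select_events(events, lower, upper):
    """Keep events with lower <= ActionDate < upper (ActionDate at index 2)."""
    return [e for e in events if lower <= e[2] < upper]


events = [
    (1, 10, datetime(2009, 2, 3, 8, 0), 1),   # handled by an earlier run
    (1, 10, datetime(2009, 2, 5, 9, 0), -1),  # falls in this run's window
    (2, 10, datetime(2009, 2, 6, 7, 0), 1),   # arrived today: next run
]
lower, upper = run_window(datetime(2009, 2, 4), datetime(2009, 2, 6, 10, 0))
batch = select_events(events, lower, upper)
print(batch)
last_processed = upper  # what would be written back to the control table
```

Cutting at midnight also keeps all events of one day in the same run, which the same-day tie-break between joins and leaves relies on. Passing `rebuild=True` corresponds to the rebuild flag: the lower bound goes back to the start of time, and the membership table would be truncated before the run.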