sql-server - How to calculate overlapping subscription days from orders using SQL Server

Question

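I have an orders table. I want to calculate the number of subscription days each user has left on a specific date (preferably in a set-based way).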

create table #orders (orderid int, userid int, subscriptiondays int, orderdate date)
insert into #orders
    select 1, 2, 10, '2011-01-01'
    union 
    select 2, 1, 10, '2011-01-10'
    union 
    select 3, 1, 10, '2011-01-15'
    union 
    select 4, 2, 10, '2011-01-15'

declare @currentdate date = '2011-01-20'

--userid 1 is expected to have 10 subscriptiondays left
--(since there are 5 left when the second order is placed)
--userid 2 is expected to have 5 subscriptiondays left

I'm sure this has been done before; I just don't know what to search for. Is it something like a running total?

So when I set @currentdate to '2011-01-20', I want this result:

userid      subscriptiondays
1           10
2           5

When I set @currentdate to '2011-01-25':

userid      subscriptiondays
1           5
2           0

When I set @currentdate to '2011-01-11':

userid      subscriptiondays
1           9
2           0

Thanks!

score 4 · Accepted Answer

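I think you need to use a recursive common table expression.

EDIT: I've also added a procedural implementation further down that does not use a recursive common table expression. I recommend the procedural approach, because I think there may be data scenarios that the recursive CTE query I've included cannot handle.

The query below gives the correct answers for the scenarios you provided, but you may want to think up some additional, more complex scenarios and check whether there are any bugs.

For example, I have a feeling that this query may fail if several earlier orders overlap a later order.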

with CurrentOrders (UserId, SubscriptionDays, StartDate, EndDate) as
(
    select
        userid,
        sum(subscriptiondays),
        min(orderdate),
        dateadd(day, sum(subscriptiondays), min(orderdate))
    from #orders
    where
        #orders.orderdate <= @currentdate
        -- start with the latest order(s)
        and not exists (
            select 1
            from #orders o2
            where
                o2.userid = #orders.userid
                and o2.orderdate <= @currentdate
                and o2.orderdate > #orders.orderdate
        )
    group by
        userid

    union all

    select
        #orders.userid,
        #orders.subscriptiondays,
        #orders.orderdate,
        dateadd(day, #orders.subscriptiondays, #orders.orderdate)
    from #orders
    -- join any overlapping orders
    inner join CurrentOrders on
        #orders.userid = CurrentOrders.UserId
        and #orders.orderdate < CurrentOrders.StartDate
        and dateadd(day, #orders.subscriptiondays, #orders.orderdate) > CurrentOrders.StartDate
)
select
    UserId,
    sum(SubscriptionDays) as TotalSubscriptionDays,
    min(StartDate),
    sum(SubscriptionDays) - datediff(day, min(StartDate), @currentdate) as RemainingSubscriptionDays
from CurrentOrders
group by
    UserId
;
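
To make the idea behind the query easier to follow outside of SQL, here is a rough Python sketch of the same backward walk: start from each user's latest order, keep pulling in earlier orders whose span reaches past the current start date, then subtract the days elapsed since that start. It is not a line-for-line port of the CTE; the helper name `remaining_days_backward` and the tuple layout are made up for illustration, and the data is the sample from the question.

```python
from datetime import date, timedelta

# Sample rows from the question: (orderid, userid, subscriptiondays, orderdate)
ORDERS = [
    (1, 2, 10, date(2011, 1, 1)),
    (2, 1, 10, date(2011, 1, 10)),
    (3, 1, 10, date(2011, 1, 15)),
    (4, 2, 10, date(2011, 1, 15)),
]


def remaining_days_backward(orders, current_date):
    """Hypothetical helper: start at the latest order(s) and walk backwards."""
    result = {}
    for user in sorted({o[1] for o in orders}):
        # Only orders placed on or before the reference date count.
        rows = [(d, n) for (_, u, n, d) in orders if u == user and d <= current_date]
        if not rows:
            continue
        # Anchor step: the latest order(s) for this user.
        start = max(d for d, _ in rows)
        total = sum(n for d, n in rows if d == start)
        # Repeated step: add earlier orders whose span reaches past the current start.
        while True:
            overlapping = [(d, n) for d, n in rows
                           if d < start and d + timedelta(days=n) > start]
            if not overlapping:
                break
            total += sum(n for _, n in overlapping)
            start = min(d for d, _ in overlapping)
        # Like the query, this is not clamped at zero.
        result[user] = total - (current_date - start).days
    return result


print(remaining_days_backward(ORDERS, date(2011, 1, 20)))  # {1: 10, 2: 5}
```

With the sample data this reproduces the three result sets from the question. Because an earlier order is only picked up when its own, unshifted span reaches the current start date, chains of more than two stacked orders can still be missed, which is the kind of scenario the caveat above is about.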

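Philip raised a concern about the recursion limit on common table expressions. Below is a procedural alternative using a table variable and a while loop, which I believe accomplishes the same thing.

I have verified that this alternative code works, at least for the sample data provided, but I'd be glad to hear anyone's comments on the approach. Good idea? Bad idea? Any issues to watch out for?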

declare @ModifiedRows int

declare @CurrentOrders table
(
    UserId int not null,
    SubscriptionDays int not null,
    StartDate date not null,
    EndDate date not null
)

insert into @CurrentOrders
select
    userid,
    sum(subscriptiondays),
    min(orderdate),
    min(dateadd(day, subscriptiondays, orderdate))
from #orders
where
    #orders.orderdate <= @currentdate
    -- start with the latest order(s)
    and not exists (
        select 1
        from #orders o2
        where
            o2.userid = #orders.userid
            and o2.orderdate <= @currentdate
            -- there does not exist any other order that surpasses it
            and dateadd(day, o2.subscriptiondays, o2.orderdate) > dateadd(day, #orders.subscriptiondays, #orders.orderdate)
    )
group by
    userid

set @ModifiedRows = @@ROWCOUNT


-- perform an extra update here in case there are any additional orders that were made after the start date but before the specified @currentdate
update co set
    co.SubscriptionDays = co.SubscriptionDays + #orders.subscriptiondays
from @CurrentOrders co
inner join #orders on
    #orders.userid = co.UserId
    and #orders.orderdate <= @currentdate
    and #orders.orderdate >= co.StartDate
    and dateadd(day, #orders.subscriptiondays, #orders.orderdate) < co.EndDate


-- Keep attempting to update rows as long as rows were updated on the previous attempt
while(@ModifiedRows > 0)
begin
    update co set
        SubscriptionDays = co.SubscriptionDays + overlap.subscriptiondays,
        StartDate = overlap.orderdate
    from @CurrentOrders co
    -- join any overlapping orders
    inner join (
        select
            #orders.userid,
            sum(#orders.subscriptiondays) as subscriptiondays,
            min(orderdate) as orderdate
        from #orders
        inner join @CurrentOrders co2 on
            #orders.userid = co2.UserId
            and #orders.orderdate < co2.StartDate
            and dateadd(day, #orders.subscriptiondays, #orders.orderdate) > co2.StartDate
        group by
            #orders.userid
    ) overlap on
        overlap.userid = co.UserId

    set @ModifiedRows = @@ROWCOUNT
end

select
    UserId,
    sum(SubscriptionDays) as TotalSubscriptionDays,
    min(StartDate),
    sum(SubscriptionDays) - datediff(day, min(StartDate), @currentdate) as RemainingSubscriptionDays
from @CurrentOrders
group by
    UserId

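EDIT2: I made some adjustments to the code above to handle various special cases, for example when a user happens to have two orders that both end on the same date.

For example, changing the setup data to the following caused problems with the original code, which I have now corrected: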

insert into #orders
    select 1, 2, 10, '2011-01-01'
    union 
    select 2, 1, 10, '2011-01-10'
    union 
    select 3, 1, 10, '2011-01-15'
    union 
    select 4, 2, 6, '2011-01-15'
    union 
    select 5, 2, 4, '2011-01-17'

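EDIT3: I made some further adjustments to handle other special cases. In particular, the previous code had problems with the following setup data, which I have now corrected: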

insert into #orders
    select 1, 2, 10, '2011-01-01'
    union 
    select 2, 1, 6, '2011-01-10'
    union 
    select 3, 1, 10, '2011-01-15'
    union 
    select 4, 2, 10, '2011-01-15'
    union 
    select 5, 1, 4, '2011-01-12'

score 0 · Accepted Answer

If my clarifying comment/question is correct, then you want to use DATEDIFF:

DATEDIFF(dd, orderdate,  @currentdate)

score 0 · Accepted Answer

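In fact, we need to calculate the sum of the subscription days minus the number of days between the first subscription date and @currentdate, for example: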

select o.userid,
       sum(o.subscriptiondays) -
       DATEDIFF(dd,
                -- first order date for this user
                (select min(a.orderdate)
                 from #orders as a
                 where a.userid = o.userid),  @currentdate)
from #orders as o
where o.orderdate <= @currentdate
group by o.userid
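
As a quick check of this formula against the sample data from the question, here is a small Python sketch. The helper name `remaining_days_naive` is made up for illustration; it simply computes the sum of subscription days minus the days elapsed since the user's first order.

```python
from datetime import date

# Sample rows from the question: (orderid, userid, subscriptiondays, orderdate)
ORDERS = [
    (1, 2, 10, date(2011, 1, 1)),
    (2, 1, 10, date(2011, 1, 10)),
    (3, 1, 10, date(2011, 1, 15)),
    (4, 2, 10, date(2011, 1, 15)),
]


def remaining_days_naive(orders, current_date):
    """Hypothetical helper: sum(subscriptiondays) - days since the first order."""
    totals, first = {}, {}
    for _, user, days, placed in orders:
        if placed <= current_date:
            totals[user] = totals.get(user, 0) + days
            first[user] = min(first.get(user, placed), placed)
    return {u: totals[u] - (current_date - first[u]).days for u in sorted(totals)}


print(remaining_days_naive(ORDERS, date(2011, 1, 20)))  # {1: 10, 2: 1}
```

For userid 1 this matches the expected 10. For userid 2 it gives 1 rather than the expected 5, because the four days between the first order running out and the second order being placed are subtracted as well.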

score 0 · Accepted Answer

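My interpretation of the problem:

On day X, a customer purchases a "span" of subscription days (i.e. good for N days)
The span starts on the day of purchase and covers day X through day X + (N - 1)... but see below
If the customer purchases a second span after the first span has expired (or any new span after all existing spans have expired), repeat the process. (A single 10-day purchase made 30 days ago has no effect on a second purchase made today.)
If the customer purchases a span while existing span(s) are still in effect, the new span applies from the day immediately after the end of the current span(s) through that date + (N - 1)
This is iterative. If a customer buys 10-day spans on Jan 1, Jan 2, and Jan 3, it looks like:

As of the 1st: Jan 1 to Jan 10

As of the 2nd: Jan 1 to Jan 10, Jan 11 to Jan 20 (effectively, Jan 1 to Jan 20)

As of the 3rd: Jan 1 to Jan 10, Jan 11 to Jan 20, Jan 21 to Jan 30 (effectively, Jan 1 to Jan 30)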

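If this is indeed the problem, then it is a horrible one to solve in T-SQL. To determine the "effective span" of a given purchase, you have to calculate the effective span of every prior purchase, in the order the purchases were made, because of the overall cumulative effect. That is trivial for 1 user and 3 rows, but not for thousands of users with dozens of purchases each (which is presumably what you want).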
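
The cumulative calculation described above can be sketched procedurally in a few lines of Python. This is only an illustration under stated assumptions: the helper name `remaining_days_forward` and the tuple layout are made up, the reference day itself is counted as a remaining day (which is what the question's expected results imply), and the result is clamped at zero.

```python
from datetime import date, timedelta

# Sample rows from the question: (orderid, userid, subscriptiondays, orderdate)
ORDERS = [
    (1, 2, 10, date(2011, 1, 1)),
    (2, 1, 10, date(2011, 1, 10)),
    (3, 1, 10, date(2011, 1, 15)),
    (4, 2, 10, date(2011, 1, 15)),
]


def remaining_days_forward(orders, current_date):
    """Hypothetical helper: apply each user's orders oldest first, carrying the end date."""
    ends = {}  # userid -> first day NOT covered by any span so far
    for _, user, days, placed in sorted(orders, key=lambda o: o[3]):
        if placed > current_date:
            continue  # orders placed after the reference date do not count yet
        # A still-active span pushes the new one back; an expired span is ignored.
        begin = max(ends.get(user, placed), placed)
        ends[user] = begin + timedelta(days=days)
    # Assumption: never report a negative number of remaining days.
    return {u: max(0, (end - current_date).days) for u, end in sorted(ends.items())}


print(remaining_days_forward(ORDERS, date(2011, 1, 20)))  # {1: 10, 2: 5}
```

Each order only needs the end date carried over from the user's previous orders, which is why the result depends on processing the purchases in order.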

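I'd solve it like this:

Add a column EffectiveDate of datatype date to the table
Build a one-time process that walks through every row, user by user and order date by order date, and calculates the EffectiveDate as discussed above
Modify the process used to insert data so that it calculates the EffectiveDate when a new entry is created. Done this way, you only ever need to look at that user's most recent purchase.
Work out the subsequent issues around deleting (cancelling?) or updating (mis-set?) orders

I may be wrong, but I don't see any way to solve this with set-based tactics. (Recursive CTEs and the like would work, but they can only recurse so many levels, and we don't know the limits of this problem, let alone how often you need to run it or how well it has to perform.) I'll be watching, and I'll upvote anyone who solves this without recursion!

Of course, this only applies if my understanding of the problem is correct. If not, please disregard.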
