5

假设我们有这样一张桌子:

declare @periods table (
    s date, 
    e date,
    t tinyint
);

具有按开始日期(s)排序的无间隔的日期间隔

insert into @periods values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);

所有日期间隔都有不同的类型 (t)。

需要组合相同类型的日期间隔,它们不会被其他类型的间隔打破(所有间隔都按开始日期排序)。

因此结果表应如下所示:

      s     |      e     |  t
------------|------------|-----
 2013-01-01 | 2013-01-02 |  3
 2013-01-02 | 2013-01-05 |  1
 2013-01-05 | 2013-01-08 |  2
 2013-01-08 | 2013-01-09 |  1

任何想法如何在没有光标的情况下做到这一点?


我有一个可行的解决方案:

declare @periods table (
    s datetime primary key clustered, 
    e datetime,
    t tinyint,
    period_number int   
);

insert into @periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);

declare @t tinyint = null;  
declare @PeriodNumber int = 0;
declare @anchor date;

update @periods
    set  period_number = @PeriodNumber, 
    @PeriodNumber = case
                        when @t <> t
                            then  @PeriodNumber + 1
                        else
                            @PeriodNumber
                    end,
    @t = t,
    @anchor = s
option (maxdop 1);

select 
    s = min(s),
    e = max(e),
    t = min(t)
from 
    @periods    
group by 
    period_number
order by 
    s;

但我怀疑我是否可以依赖 UPDATE 语句的这种行为?

我使用 SQL Server 2008 R2。


编辑:

感谢 Daniel 和这篇文章:http ://www.sqlservercentral.com/articles/T-SQL/68467/

我发现上述解决方案中遗漏了三件重要的事情:

  1. 表上必须有聚集索引
  2. 必须有锚变量和聚集列的调用
  3. 更新语句应该由一个处理器执行,即没有并行性

我已根据这些规则更改了上述解决方案。

4

5 回答 5

5

由于您的范围是连续的,因此问题本质上变成了问题。如果您有一个标准来帮助您区分具有相同t值的不同序列,您可以使用该标准对所有行进行分组,然后MIN(s), MAX(e)对每个组进行分组。

获得这种标准的一种方法是使用两次ROW_NUMBER调用。考虑以下查询:

SELECT
  *,
  rnk1 = ROW_NUMBER() OVER (               ORDER BY s),
  rnk2 = ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
FROM @periods
;

对于您的示例,它将返回以下集合:

s           e           t   rnk1  rnk2
----------  ----------  --  ----  ----
2013-01-01  2013-01-02  3   1     1
2013-01-02  2013-01-04  1   2     1
2013-01-04  2013-01-05  1   3     2
2013-01-05  2013-01-06  2   4     1
2013-01-06  2013-01-07  2   5     2
2013-01-07  2013-01-08  2   6     3
2013-01-08  2013-01-09  1   7     3

rnk1和排名的有趣之rnk2处在于,如果您从另一个中减去一个,您将获得与 一起t唯一标识具有相同 的行的每个不同序列的值t

s           e           t   rnk1  rnk2  rnk1 - rnk2
----------  ----------  --  ----  ----  -----------
2013-01-01  2013-01-02  3   1     1     0
2013-01-02  2013-01-04  1   2     1     1
2013-01-04  2013-01-05  1   3     2     1
2013-01-05  2013-01-06  2   4     1     3
2013-01-06  2013-01-07  2   5     2     3
2013-01-07  2013-01-08  2   6     3     3
2013-01-08  2013-01-09  1   7     3     4

知道了这一点,您可以轻松地应用分组和聚合。这是最终查询的样子:

WITH partitioned AS (
  SELECT
    *,
    g = ROW_NUMBER() OVER (               ORDER BY s)
      - ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
  FROM @periods
)
SELECT
  s = MIN(s),
  e = MAX(e),
  t
FROM partitioned
GROUP BY
  t,
  g
;

如果您愿意,可以在 SQL Fiddle上使用此解决方案。

于 2013-02-24T21:44:05.663 回答
2

这个怎么样?

declare @periods table (
    s datetime primary key, 
    e datetime,
    t tinyint,
    s2 datetime
);

insert into @periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);

update @periods set s2 = s;

while @@ROWCOUNT > 0
begin
    update p2 SET s2=p1.s
    from @periods p1
    join @PERIODS P2 ON p2.t = p1.t AND p2.s2 = p1.e;
end

select s2 as s, max(e) as e, min(t) as t
from @periods
group by s2
order by s2;
于 2013-02-19T18:54:47.443 回答
2

避免更新和游标的可能解决方案应该是使用公用表表达式...

像这样...

    declare @periods table (
        s date, 
        e date,
        t tinyint
    );


    insert into @periods values
    ('2013-01-01' , '2013-01-02', 3),
    ('2013-01-02' , '2013-01-04', 1),
    ('2013-01-04' , '2013-01-05', 1),
    ('2013-01-05' , '2013-01-06', 2),
    ('2013-01-06' , '2013-01-07', 2),
    ('2013-01-07' , '2013-01-08', 2),
    ('2013-01-08' , '2013-01-09', 1);

    with cte as ( select 0 as n
                        ,p.s as s
                        ,p.e as e
                        ,p.t
                        ,case when p2.s is null then 1 else 0 end fl_s
                        ,case when p3.e is null then 1 else 0 end fl_e
                  from @periods p
                  left outer join @periods p2
                  on p2.e = p.s
                  and p2.t = p.t
                  left outer join @periods p3
                  on p3.s = p.e
                  and p3.t = p.t

                  union all 

                  select  n+1 as n
                        , p2.s as s
                        , p.e as e
                        ,p.t
                        ,case when not exists(select * from @periods p3 where p3.e =p2.s and p3.t=p2.t) then 1 else 0 end as fl_s
                        ,p.fl_e as fl_e
                  from cte p
                  inner join @periods p2
                  on p2.e = p.s
                  and p2.t = p.t
                  where p.fl_s = 0

                  union all 

                  select  n+1 as n
                        , p.s as s
                        , p2.e as e
                        ,p.t
                        ,p.fl_s as fl_s
                        ,case when not exists(select * from @periods p3 where p3.s =p2.e and p3.t=p2.t) then 1 else 0 end as fl_e
                  from cte p
                  inner join @periods p2
                  on p2.s = p.e
                  and p2.t = p.t
                  where p.fl_s = 1
                  and p.fl_e = 0
    )
    ,result as (select s,e,t,COUNT(*) as count_lines
                 from cte
                 where fl_e = 1
                 and fl_s = 1
                 group by s,e,t
                 )
    select * from result
    option(maxrecursion 0)

结果集达到...

    s           e           t   count_lines
    2013-01-01  2013-01-02  3   1
    2013-01-02  2013-01-05  1   2
    2013-01-05  2013-01-08  2   3
    2013-01-08  2013-01-09  1   1
于 2013-02-19T19:27:13.657 回答
1

万岁!我找到了适合我的解决方案,并且无需迭代即可完成

with cte1 as (   
    select s, t  from @periods
    union all
    select max(e), null from @periods
),
cte2 as (
    select rn = row_number() over(order by s), s, t from cte1   
),
cte3 as (
    select 
        rn = row_number() over(order by a.rn),
        a.s,
        a.t 
    from 
        cte2 a 
        left join cte2 b on a.rn = b.rn + 1 and a.t = b.t
    where
        b.rn is null 
)
select 
    s = a.s, 
    e = b.s, 
    a.t  
from 
    cte3 a 
    inner join cte3 b on b.rn = a.rn + 1;

感谢大家分享您的想法和解决方案!


细节:

cte1 返回日期链及其的类型:

s          t
---------- ----
2013-01-01 3
2013-01-02 1
2013-01-04 1
2013-01-05 2
2013-01-06 2
2013-01-07 2
2013-01-08 1
2013-01-09 NULL  -- there is no type *after* the last date

ct2 只需将行号添加到上述结果中:

 rn       s       t
---- ----------  ----
 1    2013-01-01  3
 2    2013-01-02  1
 3    2013-01-04  1
 4    2013-01-05  2
 5    2013-01-06  2
 6    2013-01-07  2
 7    2013-01-08  1
 8    2013-01-09  NULL

如果我们在不带where条件的情况下输出 cte3 中查询的所有字段,我们会得到以下结果:

select * from cte2 a left join cte2 b on a.rn = b.rn + 1 and a.t = b.t;

rn       s        t       rn      s          t
---- ----------  ----    ------ ----------  ----
1    2013-01-01  3       NULL   NULL        NULL
2    2013-01-02  1       NULL   NULL        NULL
3    2013-01-04  1       2      2013-01-02  1
4    2013-01-05  2       NULL   NULL        NULL
5    2013-01-06  2       4      2013-01-05  2
6    2013-01-07  2       5      2013-01-06  2
7    2013-01-08  1       NULL   NULL        NULL
8    2013-01-09  NULL    NULL   NULL        NULL

对于重复类型的日期,结果右侧有值。所以我们可以删除右侧存在值的所有行。

所以 cte3 返回:

rn    s           t
----- ----------  ----
1     2013-01-01  3
2     2013-01-02  1
3     2013-01-05  2
4     2013-01-08  1
5     2013-01-09  NULL

请注意,由于删除了一些行,因此在 rn 序列中存在一些间隙,因此我们必须再次对其重新编号。

从这里只剩下一件事 - 将日期转换为句点:

select 
    s = a.s, 
    e = b.s, 
    a.t  
from 
    cte3 a 
    inner join cte3 b on b.rn = a.rn + 1;

我们得到了所需的结果:

s          e          t
---------- ---------- ----
2013-01-01 2013-01-02 3
2013-01-02 2013-01-05 1
2013-01-05 2013-01-08 2
2013-01-08 2013-01-09 1
于 2013-02-20T07:17:04.710 回答
0

这是您的解决方案,表格上有不同的数据..

    declare @periods table (
        s datetime primary key, 
        e datetime,
        t tinyint,
        period_number int   
    );

    insert into @periods (s, e, t) values
    ('2013-01-01' , '2013-01-02', 3),
    ('2013-01-02' , '2013-01-04', 1),
    ('2013-01-04' , '2013-01-05', 1),
    ('2013-01-05' , '2013-01-06', 2),
    ('2013-01-09' , '2013-01-10', 2),
    ('2013-01-10' , '2013-01-11', 1);

    declare @t tinyint = null;  
    declare @PeriodNumber int = 0;

    update @periods
        set  period_number = @PeriodNumber, 
        @PeriodNumber = case
                            when @t <> t
                                then  @PeriodNumber + 1
                            else
                                @PeriodNumber
                        end,
        @t = t;

    select 
        s = min(s),
        e = max(e),
        t = min(t)
    from 
        @periods    
    group by 
        period_number
    order by 
        s;

哪里有差距

 ('2013-01-05' , '2013-01-06', 2),
 --and
 ('2013-01-09' , '2013-01-10', 2),

您的解决方案结果集是..

    s           e           t
    2013-01-01  2013-01-02  3
    2013-01-02  2013-01-05  1
    2013-01-05  2013-01-10  2
    2013-01-10  2013-01-11  1

不是这样的结果集..??

    s           e           t
    2013-01-01  2013-01-02  3
    2013-01-02  2013-01-05  1
    2013-01-05  2013-01-06  2
    2013-01-09  2013-01-10  2
    2013-01-10  2013-01-11  1

也许我确实误解了你的问题的规则......

于 2013-02-19T19:18:53.773 回答