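Can someone help me work out how to derive an end date from a start date?

A product is referred to a company for testing. While the product is with that company, it is tested several times on different dates, and each test date is recorded to establish the product's condition (OutcomeID). I need StartDate to be the TestDate and EndDate to be the StartDate of the next row. However, if several consecutive tests result in the same OutcomeID, I need to return only one row, with the StartDate of the first test and the EndDate of the last test; in other words, one row for each run of consecutive tests in which the OutcomeID did not change. Here is my data set: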


DECLARE @ProductTests TABLE
(
    RequestID int not null,
    ProductID int not null,
    TestID    int not null,
    TestDate  datetime null,
    OutcomeID int
)

insert into @ProductTests (RequestID, ProductID, TestID, TestDate, OutcomeID)
select 1,2,22,'2005-01-21',10 union all
select 1,2,42,'2007-03-17',10 union all
select 1,2,45,'2010-12-25',10 union all
select 1,2,325,'2011-01-14',13 union all
select 1,2,895,'2011-08-10',15 union all
select 1,2,111,'2011-12-23',15 union all
select 1,2,636,'2012-05-02',10 union all
select 1,2,554,'2012-11-08',17

--select * from @ProductTests


RequestID   ProductID   TestID    TestDate        OutcomeID
1               2           22    2005-01-21         10
1               2           42    2007-03-17         10
1               2           45    2010-12-25         10
1               2           325   2011-01-14         13
1               2           895   2011-08-10         15
1               2           111   2011-12-23         15
1               2           636   2012-05-02         10
1               2           554   2012-11-08         17
This is what I need to achieve:


RequestID ProductID  StartDate        EndDate           OutcomeID
1            2       2005-01-21       2011-01-14        10
1            2       2011-01-14       2011-08-10        13
1            2       2011-08-10       2012-05-02        15
1            2       2012-05-02       2012-11-08        10
1            2       2012-11-08       NULL              17

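As you can see from the data set, the first three tests (22, 42 and 45) all resulted in OutcomeID 10, so in my result I only need the start date of test 22 and the end date of test 45, which is the start date of test 325. As you can also see with test 636, the OutcomeID went from 15 back to 10, so that row needs to be returned as well.

-- This is what I have managed to achieve so far, using the following script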


select T1.RequestID, T1.ProductID, T1.TestDate AS StartDate
       ,MIN(T2.TestDate) AS EndDate, T1.OutcomeID
from   @ProductTests T1
-- pair every test with all later tests of the same product; MIN() keeps the nearest one
left join @ProductTests T2 ON T1.RequestID = T2.RequestID
and T1.ProductID = T2.ProductID and T2.TestDate > T1.TestDate
group by T1.RequestID, T1.ProductID, T1.OutcomeID, T1.TestDate
order by T1.TestDate

Result:


RequestID   ProductID   StartDate   EndDate       OutcomeID
1                  2    2005-01-21  2007-03-17         10
1                  2    2007-03-17  2010-12-25         10
1                  2    2010-12-25  2011-01-14         10
1                  2    2011-01-14  2011-08-10         13
1                  2    2011-08-10  2011-12-23         15
1                  2    2011-12-23  2012-05-02         15
1                  2    2012-05-02  2012-11-08         10
1                  2    2012-11-08  NULL               17

2 Answers

Asked on November 7 and still unanswered, so here is my solution. It isn't very pretty, but it works.

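My tip is to look at the windowing, ranking and aggregate functions such as ROW_NUMBER, RANK, AVG and SUM. They are essential when you want to write reports, and they become very powerful in SQL Server 2012.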
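As a small illustration of the kind of window function meant here, this is a minimal sketch that assumes SQL Server 2012 or later, where LAG and LEAD are available. The aliases PrevOutcomeID and NextTestDate are made up for the example, and the sample rows are the ones from the question.

```sql
-- Sketch only: LAG and LEAD require SQL Server 2012 or later.
DECLARE @ProductTests TABLE
(
    RequestID int NOT NULL,
    ProductID int NOT NULL,
    TestID    int NOT NULL,
    TestDate  datetime NULL,
    OutcomeID int
);

-- Unseparated 'YYYYMMDD' literals are read the same way under any DATEFORMAT setting.
INSERT INTO @ProductTests (RequestID, ProductID, TestID, TestDate, OutcomeID)
VALUES (1, 2,  22, '20050121', 10),
       (1, 2,  42, '20070317', 10),
       (1, 2,  45, '20101225', 10),
       (1, 2, 325, '20110114', 13),
       (1, 2, 895, '20110810', 15),
       (1, 2, 111, '20111223', 15),
       (1, 2, 636, '20120502', 10),
       (1, 2, 554, '20121108', 17);

SELECT
    RequestID,
    ProductID,
    TestID,
    TestDate,
    OutcomeID,
    -- outcome of the previous test of the same product (NULL for the first test)
    PrevOutcomeID = LAG(OutcomeID) OVER (PARTITION BY RequestID, ProductID ORDER BY TestDate),
    -- date of the next test of the same product (NULL for the last test)
    NextTestDate  = LEAD(TestDate) OVER (PARTITION BY RequestID, ProductID ORDER BY TestDate)
FROM @ProductTests
ORDER BY RequestID, ProductID, TestDate;
```

NextTestDate carries the same values as the EndDate column of the self-join in the question, without the join. The rows where PrevOutcomeID is NULL or differs from OutcomeID are the ones that start a new period.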

I have also used a CTE (common table expression), but it could just as well be written as a subquery or with a temp table.

;with cte (ida, requestid, productid, testid, testdate, outcomeid) as
(
    -- select the rows where the outcome id changes
    -- (#ProductTests is a temp table assumed to hold the sample data from the question)
    select b.*
    from (select ROW_NUMBER() over (partition by requestid, productid order by testDate) as id, *
          from #ProductTests) a
    right outer join
         (select ROW_NUMBER() over (partition by requestid, productid order by testDate) as id, *
          from #ProductTests) b
      on a.requestID = b.requestID and a.productID = b.productID and a.id + 1 = b.id
    where a.outcomeid <> b.outcomeid   -- outcome differs from the previous row
       or b.outcomeid is null
       or a.id is null                 -- first row of a product: there is no previous row
)
select a.RequestID, a.ProductID, a.TestDate AS StartDate, MIN(b.TestDate) AS EndDate, a.OutcomeID
from cte a
left join cte b
  on a.requestid = b.requestid and a.productid = b.productid and a.testdate < b.testdate
group by a.RequestID, a.ProductID, a.OutcomeID, a.TestDate
order by StartDate
answered 2012-11-16T14:03:32.033

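Actually, your question seems to contain two problems. One is how to group consecutive rows (consecutive by some criterion) that contain the same value. The other is the one actually spelled out in your title, i.e. how to use the next row's StartDate as the current row's EndDate.

Personally, I would solve the two problems in the order I mentioned them, so I would tackle the grouping first. One way to group the data correctly in this case is to use double ranking, like this: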

WITH partitioned AS (
  SELECT
    *,
    grp = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate)
        - ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
  FROM @ProductTests
)
, grouped AS (
  SELECT
    RequestID,
    ProductID,
    StartDate = MIN(TestDate),
    OutcomeID
  FROM partitioned
  GROUP BY
    RequestID,
    ProductID,
    OutcomeID,
    grp
)
SELECT *
FROM grouped
;

This should give the following output for your data sample:

RequestID  ProductID  StartDate   OutcomeID
---------  ---------  ----------  ---------
1          2          2005-01-21  10
1          2          2011-01-14  13
1          2          2011-08-10  15
1          2          2012-05-02  10
1          2          2012-11-08  17
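To see why the difference of the two rankings identifies each run, it can help to look at the two row numbers side by side. The following is only an illustrative sketch: the aliases rn_all and rn_outcome are made up for this example, and the sample rows are the ones from the question.

```sql
DECLARE @ProductTests TABLE
(
    RequestID int NOT NULL,
    ProductID int NOT NULL,
    TestID    int NOT NULL,
    TestDate  datetime NULL,
    OutcomeID int
);

-- Unseparated 'YYYYMMDD' literals are read the same way under any DATEFORMAT setting.
INSERT INTO @ProductTests (RequestID, ProductID, TestID, TestDate, OutcomeID)
VALUES (1, 2,  22, '20050121', 10),
       (1, 2,  42, '20070317', 10),
       (1, 2,  45, '20101225', 10),
       (1, 2, 325, '20110114', 13),
       (1, 2, 895, '20110810', 15),
       (1, 2, 111, '20111223', 15),
       (1, 2, 636, '20120502', 10),
       (1, 2, 554, '20121108', 17);

SELECT
    TestID,
    TestDate,
    OutcomeID,
    -- position of the test among all tests of the product
    rn_all     = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate),
    -- position of the test among the tests of the product that share its outcome
    rn_outcome = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate),
    -- the difference stays constant for as long as the outcome does not change
    grp        = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate)
               - ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
FROM @ProductTests
ORDER BY RequestID, ProductID, TestDate;
```

For the sample data this gives (dates shown without the time part):

TestID  TestDate    OutcomeID  rn_all  rn_outcome  grp
------  ----------  ---------  ------  ----------  ---
22      2005-01-21  10         1       1           0
42      2007-03-17  10         2       2           0
45      2010-12-25  10         3       3           0
325     2011-01-14  13         4       1           3
895     2011-08-10  15         5       1           4
111     2011-12-23  15         6       2           4
636     2012-05-02  10         7       4           3
554     2012-11-08  17         8       1           7

Within a run both row numbers grow by one per row, so grp does not move. Note that grp alone is not unique: tests 325 and 636 both get 3 but have different outcomes, which is why OutcomeID has to be part of the GROUP BY together with grp.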

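Obviously, one thing is still missing, namely EndDate, and now is the right time to take care of it. ROW_NUMBER() is used once more, this time to rank the result set of the grouped CTE. The ranking is then used in the join condition when the result set is joined to itself (with an outer join):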

WITH partitioned AS (
  SELECT
    *,
    grp = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate)
        - ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
  FROM @ProductTests
)
, grouped AS (
  SELECT
    RequestID,
    ProductID,
    StartDate = MIN(TestDate),
    OutcomeID,
    rnk = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID ORDER BY MIN(TestDate))
  FROM partitioned
  GROUP BY
    RequestID,
    ProductID,
    OutcomeID,
    grp
)
SELECT
  g1.RequestID,
  g1.ProductID,
  g1.StartDate,
  g2.StartDate AS EndDate,
  g1.OutcomeID
FROM grouped g1
LEFT JOIN grouped g2
  ON g1.RequestID = g2.RequestID
 AND g1.ProductID = g2.ProductID
 AND g1.rnk = g2.rnk - 1
;

You can try this query on SQL Fiddle to verify that it returns the output you are after.
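If you happen to be on SQL Server 2012 or later, the second ranking and the self-join could probably be replaced by LEAD. This is a swapped-in variation rather than the query above, shown as a sketch under that version assumption, with the sample rows from the question.

```sql
-- Sketch only: LEAD requires SQL Server 2012 or later.
DECLARE @ProductTests TABLE
(
    RequestID int NOT NULL,
    ProductID int NOT NULL,
    TestID    int NOT NULL,
    TestDate  datetime NULL,
    OutcomeID int
);

-- Unseparated 'YYYYMMDD' literals are read the same way under any DATEFORMAT setting.
INSERT INTO @ProductTests (RequestID, ProductID, TestID, TestDate, OutcomeID)
VALUES (1, 2,  22, '20050121', 10),
       (1, 2,  42, '20070317', 10),
       (1, 2,  45, '20101225', 10),
       (1, 2, 325, '20110114', 13),
       (1, 2, 895, '20110810', 15),
       (1, 2, 111, '20111223', 15),
       (1, 2, 636, '20120502', 10),
       (1, 2, 554, '20121108', 17);

WITH partitioned AS (
  SELECT
    *,
    grp = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate)
        - ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
  FROM @ProductTests
)
, grouped AS (
  -- one row per run of consecutive tests with the same outcome
  SELECT
    RequestID,
    ProductID,
    StartDate = MIN(TestDate),
    OutcomeID
  FROM partitioned
  GROUP BY
    RequestID,
    ProductID,
    OutcomeID,
    grp
)
SELECT
  RequestID,
  ProductID,
  StartDate,
  -- the next run's StartDate becomes this run's EndDate (NULL for the last run)
  EndDate = LEAD(StartDate) OVER (PARTITION BY RequestID, ProductID ORDER BY StartDate),
  OutcomeID
FROM grouped
ORDER BY RequestID, ProductID, StartDate;
```

For the sample data this should return the same five rows as the self-join version, with EndDate NULL for the last run (OutcomeID 17).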

answered 2012-11-16T15:10:25.253