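I have a list of dates.

I need to select the largest set of records that fall within 6 months of each other.

Then the next largest set of records, and so on, until all records have been selected.

Here is the data: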

1  19-Oct-2007
2  03-Dec-2007
3  16-Oct-2009
4  26-Oct-2009
5  30-Oct-2009
6  01-Nov-2009
7  16-Nov-2009
8  30-Nov-2009
9  11-Dec-2009
10  25-Dec-2009
11  01-Jan-2010
12  21-Jan-2010
13  27-Jan-2010
14  28-Jan-2010
15  28-Jan-2010
16  12-Feb-2010
17  12-Feb-2010
18  27-Feb-2010
19  09-Mar-2010
20  22-Mar-2010
21  26-Mar-2010
22  01-Apr-2010
23  22-Oct-2010
24  15-Oct-2011
25  18-Oct-2011
26  26-Oct-2011
27  16-Nov-2011
28  18-Nov-2011
29  19-Nov-2011
30  26-Nov-2011
31  29-Nov-2011
32  29-Nov-2011
33  30-Nov-2011
34  06-Dec-2011
35  16-Dec-2011
36  17-Dec-2011
37  20-Dec-2011
38  28-Dec-2011
39  01-Jan-2012
40  01-Jan-2012
41  09-Jan-2012
42  13-Jan-2012
43  27-Jan-2012
44  01-Feb-2012
45  23-Feb-2012
46  29-Feb-2012
47  01-Mar-2012
48  01-Mar-2012
49  01-Mar-2012
50  02-Mar-2012
51  04-Mar-2012
52  04-Mar-2012
53  05-Mar-2012
54  05-Mar-2012
55  17-Mar-2012
56  23-Mar-2012
57  24-Mar-2012
58  01-Apr-2012
59  03-Apr-2012
60  04-Apr-2012

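One possible solution is to select:

  • Records 24-60 (they span 172 days)
  • Record 23 (it is not within 6 months of the preceding or following date)
  • Records 3-22 (they span 167 days)
  • Records 1-2 (they are 45 days apart)

(I started from the latest date and selected backwards. This may not be the optimal solution.)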

4 Answers

Here is an iterative approach to the problem; at the moment I have nothing better to suggest. It should work, though:

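-- Number the rows from the latest date to the earliest
WITH ranked AS (
  SELECT *, rnk = ROW_NUMBER() OVER (ORDER BY Date DESC)
  FROM data
),
marked AS (
  -- Anchor: the latest date is the first group marker
  SELECT
    rnk,
    Date,
    GroupDate = Date
  FROM ranked
  WHERE rnk = 1
  UNION ALL
  -- Recursive step: take the next (earlier) row; if it lies more than
  -- 6 months before the current marker, it becomes the new marker
  SELECT
    r.rnk,
    r.Date,
    GroupDate = CASE
      WHEN m.GroupDate > DATEADD(MONTH, 6, r.Date) THEN r.Date
      ELSE m.GroupDate
    END
  FROM ranked r
  INNER JOIN marked m ON r.rnk = m.rnk + 1
)
SELECT
  MinDate     = MIN(Date),
  MaxDate     = MAX(Date),
  [RowCount]  = COUNT(*),
  RangeLength = DATEDIFF(DAY, MIN(Date), MAX(Date))
FROM marked
GROUP BY
  GroupDate
ORDER BY
  GroupDate
-- The default recursion limit is 100 levels; lift it for longer lists
OPTION (MAXRECURSION 0)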

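That is,

  1. The latest date is used as the group marker and as the reference for the range check.

  2. Subsequent (earlier) dates are processed until one is found that lies more than half a year before the marker.

  3. That date becomes the new group marker, and the process continues from step 1 until there are no more rows.

The rows are ranked before the iteration starts. However, if you have a column that is guaranteed to contain unique consecutive values without gaps, you can use that column instead of the ranking numbers.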
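
As a rough illustration of that last point, the recursion could walk an existing column directly and skip the ranking step. This is only a sketch: the `Id` column is hypothetical (it is not in the original post) and is assumed to be an integer that is unique, gap-free, and increasing with `Date`.

```sql
-- Sketch only: Id is a hypothetical gap-free integer column ordered like Date
WITH marked AS (
  -- Anchor: the row with the highest Id holds the latest date
  SELECT Id, Date, GroupDate = Date
  FROM data
  WHERE Id = (SELECT MAX(Id) FROM data)
  UNION ALL
  -- Step to the previous Id instead of the next rank
  SELECT
    d.Id,
    d.Date,
    GroupDate = CASE
      WHEN m.GroupDate > DATEADD(MONTH, 6, d.Date) THEN d.Date
      ELSE m.GroupDate
    END
  FROM data d
  INNER JOIN marked m ON d.Id = m.Id - 1
)
SELECT
  MinDate    = MIN(Date),
  MaxDate    = MAX(Date),
  [RowCount] = COUNT(*)
FROM marked
GROUP BY GroupDate
ORDER BY GroupDate
OPTION (MAXRECURSION 0)
```

If the column has even a single gap, the join finds no previous row and the recursion stops early, so the ranking version is the safer default.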

Here are the results it produces for the sample in the original post:

MinDate     MaxDate     RowCount     RangeLength
----------  ----------  -----------  -----------
2007-10-19  2007-12-03  2            45
2009-10-16  2010-04-01  20           167
2010-10-22  2010-10-22  1            0
2011-10-15  2012-04-04  37           172

The whole script, including the setup, can be found and played with on SQL Fiddle.
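
For trying the query locally, a minimal setup might look like the following. It is an assumption rather than the script from the fiddle: the table and column names are taken from the query above, the `date` column type is a guess, and only seven of the question's sixty dates are loaded (records 1, 2, 3, 22, 23, 24 and 60, the first and last row of each group).

```sql
-- Assumed setup, not the original fiddle script
CREATE TABLE data (Date date NOT NULL);

-- Subset of the question's data: the boundary rows of each group
INSERT INTO data (Date) VALUES
  ('20071019'),  -- record 1
  ('20071203'),  -- record 2
  ('20091016'),  -- record 3
  ('20100401'),  -- record 22
  ('20101022'),  -- record 23
  ('20111015'),  -- record 24
  ('20120404');  -- record 60
```

With this subset the query should return the same four date ranges as above, with row counts of 2, 2, 1 and 2.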

Answered 2012-04-12T11:11:23.017

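I used my own test data; this is pretty complicated stuff. It would probably be easier to handle with a cursor, but I am not a big fan of cursors. I did my best: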

declare @t table(record int, date datetime)
insert @t values(1,'19-Oct-2007'),
(2,'03-Dec-2007'),
(3,'2-may-2009'),
(4,'16-Oct-2009'),
(5,'26-Oct-2009'),
(6,'30-Oct-2009'),
(7,'01-Nov-2009'),
(8,'16-Nov-2009'),
(9,'30-Nov-2009'),
(10,'11-Dec-2009'),
(11,'11-Dec-2010'),
(12,'11-Dec-2010'),
(13,'11-Dec-2010')

;with a as
(
  select datediff(day, t1.date, t2.date) daysapart, 
  row_number() over (order by count desc) rn,
  b.count, 
         t1.record fromrecord, 
         t2.record torecord
  from @t t1
  join @t t2
  on t1.date <= t2.date 
     and dateadd(month, 6, t1.date) > t2.date 
     and t1.record <= t2.record
  cross apply (select count(*) count from @t where record between t1.record and t2.record) b
)
, b as
(
    select * from a where not exists 
    (select 1 from a b where (a.fromrecord between b.fromrecord and b.torecord
      or a.torecord between b.fromrecord and b.torecord)
      and a.rn > b.rn and not exists(select 1 from a c where 
      (b.fromrecord between c.fromrecord and c.torecord
      or b.torecord between c.fromrecord and c.torecord)
      and b.rn > c.rn))
)
select count, fromrecord, torecord, daysapart from b

Result:

count       fromrecord  torecord    daysapart
----------- ----------- ----------- -----------
7           4           10          56
3           11          13          0
2           1           2           45
1           3           3           0
Answered 2012-04-12T08:31:05.747

select d1.date, count(*)
from dates as d1 with (nolock) 
join dates as d2 with (nolock) 
on datediff(mm,d2.date,d1.date) < 6 
group by d1.date  
order by count(*) desc 
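
One caveat with the query above: `DATEDIFF` with a month datepart counts the month boundaries crossed, not the elapsed time, and it is negative when `d2.date` is later than `d1.date`, so every later row also satisfies `< 6`. A stricter variant might look like the sketch below. It assumes the same `dates` table and treats "within 6 months" as the half-open window starting at each distinct date, which is one interpretation of the question.

```sql
-- Month boundaries, not elapsed months: these dates are one day apart
SELECT DATEDIFF(MONTH, '20120131', '20120201') AS boundaries_crossed;  -- returns 1

-- Count the rows in the window [d1.date, d1.date + 6 months)
SELECT d1.date, COUNT(*) AS rows_in_window
FROM (SELECT DISTINCT date FROM dates) AS d1   -- avoid double counting repeated dates
JOIN dates AS d2
  ON d2.date >= d1.date
 AND d2.date <  DATEADD(MONTH, 6, d1.date)
GROUP BY d1.date
ORDER BY rows_in_window DESC;
```

This only ranks candidate windows by size; it does not remove the chosen rows and repeat, which the question also asks for.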
Answered 2012-04-11T22:15:01.240

I assumed that

  1. 6 months = 180 days
  2. If the difference between two dates is > 180 days, the pair appears toward the bottom of the list.
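
The first assumption is an approximation. Six calendar months vary in length depending on the starting date, as this small check suggests (the two starting dates are arbitrary examples chosen for illustration):

```sql
-- Length in days of the six calendar months that follow a given date
SELECT
  DATEDIFF(DAY, '20100101', DATEADD(MONTH, 6, '20100101')) AS from_jan_2010,  -- 181
  DATEDIFF(DAY, '20111015', DATEADD(MONTH, 6, '20111015')) AS from_oct_2011;  -- 183
```

For the sample data the difference does not matter, since the longest group spans 172 days, but it can change the grouping for dates that sit close to the boundary.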

Try this:

create table #list (id int, dt datetime)
-- insert your data into #list

-- Pair every row with every row; pairs more than 180 days apart
-- get sort key 1, so they sink toward the bottom of the list
select s1.id as ID_1, s1.dt as Date_1, s2.id as ID_2, s2.dt as Date_2
, abs(datediff(day, s2.dt, s1.dt)) as diff_in_days
from #list s1 cross join #list s2
order by case when abs(datediff(day, s2.dt, s1.dt)) > 180 then 1
else abs(datediff(day, s2.dt, s1.dt)) end desc
Answered 2012-04-12T05:36:35.237