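I have a table of user visits to offices. Each visit starts on one day and ends on the same day or a later one. I need to find the users who have been to an office at least once a week for 5 consecutive weeks; for example, user 1 has been to office 1 in each of the last 5 weeks.

Here is some sample data I am working with: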

DECLARE @visits table 
(
    UserId int,
    OfficeId int,
    Start datetime,
    [End] datetime
)

INSERT INTO @visits (UserId, OfficeId, Start, [End])
VALUES (1, 1, '2013-07-11', '2013-07-13'), 
       (1, 1, '2013-07-02', '2013-07-03'),
       (1, 1, '2013-06-26', '2013-06-28'),
       (1, 2, '2013-06-19', '2013-06-19'),
       (1, 1, '2013-06-17', '2013-06-17'),
       (1, 1, '2013-06-13', '2013-06-13'),
       (2, 1, '2013-07-09', '2013-07-10'),
       (2, 1, '2013-07-01', '2013-07-02'),
       (2, 1, '2013-06-27', '2013-06-28'),
       (2, 1, '2013-06-13', '2013-06-14'),
       (2, 1, '2013-06-04', '2013-06-04')

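I should get back only UserId 1, because he has been at office 1 for 5 weeks in a row. User 2 missed a week, so he should not be returned.

This needs to work for 12 weeks, but I picked 5 to keep things simple.

So far I almost have it; I just need a way to group by consecutive weeks and then add Count > 4: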

SELECT *, 
        dateadd(day, -datepart(dw, V1.Start) + 1, V1.Start) MondayOfStart,
        DATEADD(day, 7 - DATEPART(dw, V1.[End]), V1.[End]) as SundayOfEnd,
        DATEADD(day, -1, dateadd(day, -datepart(dw, V1.Start) + 1, V1.Start)) StartLess1
FROM @visits V1
INNER JOIN @visits V2 ON 
        V2.UserId = V1.UserId AND
        V2.OfficeId = V1.OfficeId AND
        DATEADD(day, 7 - DATEPART(dw, V2.[End]), V2.[End]) = DATEADD(day, -1, dateadd(day, -datepart(dw, V1.Start) + 1, V1.Start))

Edit: some real data I want to test against:

VALUES 
(2777, 2248, '2013-05-23 00:00:00.000', '2013-05-23 00:00:00.000'),
(2777, 2248, '2013-05-24 00:00:00.000', '2013-05-24 00:00:00.000'),
(2777, 2248, '2013-05-27 00:00:00.000', '2013-05-27 00:00:00.000'),
(2777, 2248, '2013-05-28 00:00:00.000', '2013-05-28 00:00:00.000'),
(2777, 2248, '2013-05-29 00:00:00.000', '2013-05-29 00:00:00.000'),
(2777, 2248, '2013-05-30 00:00:00.000', '2013-05-30 00:00:00.000'),
(2777, 2248, '2013-05-31 00:00:00.000', '2013-05-31 00:00:00.000'),
(2777, 2248, '2013-06-03 00:00:00.000', '2013-06-03 00:00:00.000'),
(2777, 2248, '2013-06-04 00:00:00.000', '2013-06-04 00:00:00.000'),
(2777, 2248, '2013-06-05 00:00:00.000', '2013-06-05 00:00:00.000'),
(2777, 2248, '2013-06-06 00:00:00.000', '2013-06-06 00:00:00.000'),
(2777, 2248, '2013-06-07 00:00:00.000', '2013-06-07 00:00:00.000'),
(2777, 2248, '2013-06-10 00:00:00.000', '2013-06-10 00:00:00.000'),
(2777, 2248, '2013-06-11 00:00:00.000', '2013-06-11 00:00:00.000'),
(2777, 2248, '2013-06-12 00:00:00.000', '2013-06-12 00:00:00.000'),
(2777, 2248, '2013-06-13 00:00:00.000', '2013-06-13 00:00:00.000'),
(2777, 2248, '2013-06-14 00:00:00.000', '2013-06-14 00:00:00.000'),
(2777, 2248, '2013-06-17 00:00:00.000', '2013-06-17 00:00:00.000'),
(2777, 2248, '2013-06-18 00:00:00.000', '2013-06-18 00:00:00.000'),
(2777, 2248, '2013-06-19 00:00:00.000', '2013-06-19 00:00:00.000'),
(2777, 2248, '2013-06-20 00:00:00.000', '2013-06-20 00:00:00.000'),
(2777, 2248, '2013-06-21 00:00:00.000', '2013-06-21 00:00:00.000'),
(2777, 2248, '2013-06-24 00:00:00.000', '2013-06-24 00:00:00.000'),
(2777, 2248, '2013-06-25 00:00:00.000', '2013-06-25 00:00:00.000'),
(2777, 2248, '2013-06-26 00:00:00.000', '2013-06-26 00:00:00.000'),
(2777, 2248, '2013-06-27 00:00:00.000', '2013-06-27 00:00:00.000'),
(2777, 2248, '2013-06-28 00:00:00.000', '2013-06-28 00:00:00.000')
2 Answers

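Here is a simple example of how this can be done with a recursive CTE.

I am not sure what exactly you need as "output", so I am showing the user together with the start and end dates of their last week. Feel free to rework it to fit your needs: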

DECLARE @visits TABLE 
(
     [UserId] INT
    ,[OfficeId] INT
    ,[Start] DATETIME
    ,[End] DATETIME
)

INSERT INTO @visits (UserId, OfficeId, Start, [End])
VALUES (1, 1, '2013-07-11', '2013-07-13'), 
       (1, 1, '2013-07-02', '2013-07-03'),
       (1, 1, '2013-06-26', '2013-06-28'),
       (1, 2, '2013-06-19', '2013-06-19'),
       (1, 1, '2013-06-17', '2013-06-17'),
       (1, 1, '2013-06-13', '2013-06-13'),
       (2, 1, '2013-07-09', '2013-07-10'),
       (2, 1, '2013-07-01', '2013-07-02'),
       (2, 1, '2013-06-27', '2013-06-28'),
       (2, 1, '2013-06-13', '2013-06-14'),
       (2, 1, '2013-06-04', '2013-06-04')

;WITH DataSource ([UserId], [OfficeId], [Start], [End], [Level]) AS
(
    SELECT AnchorMember.[UserId]
          ,AnchorMember.[OfficeId]
          ,DATEADD(DAY, -(DATEPART(WEEKDAY, AnchorMember.[Start])-1), AnchorMember.[Start]) 
          ,DATEADD(DAY, 7-(DATEPART(WEEKDAY, AnchorMember.[End])), AnchorMember.[End]) 
          ,1 AS [Level]
    FROM @visits AS AnchorMember
    UNION ALL
    SELECT RecursiveMember.[UserId]
          ,RecursiveMember.[OfficeId]
          ,DATEADD(DAY, -(DATEPART(WEEKDAY, RecursiveMember.[Start])-1), RecursiveMember.[Start]) 
          ,DATEADD(DAY, 7-(DATEPART(WEEKDAY, RecursiveMember.[End])), RecursiveMember.[End]) 
          ,DS.[Level] + 1
    FROM @visits AS RecursiveMember
    INNER JOIN DataSource DS
        ON RecursiveMember.[UserId] = DS.[UserId]
        AND RecursiveMember.[OfficeId] = DS.[OfficeId]
    -- This is the important part: the week containing this visit's [End] must begin exactly 7 days after the week start stored in DS.[Start]
    WHERE DATEADD(DAY, -(DATEPART(WEEKDAY, RecursiveMember.[End])-1), RecursiveMember.[End]) = DATEADD(DAY, 8-(DATEPART(WEEKDAY, DS.[Start])), DS.[Start])
)
SELECT [UserId]
      ,[OfficeId]
      ,[Start]
      ,[End]
FROM DataSource
WHERE [Level] = 5
ORDER BY [UserId]
        ,[OfficeId]
        ,[Start]
        ,[End]

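What the expression does can be described in the following steps:

  1. Select all records and compute the start and end days of their week.
  2. For each of those records, check whether there is another record with the same user ID and office ID that starts in the following week. If such a record exists, return it.
  3. Repeat step 2 against the results of step 2, until step 2 finds no more records.

The [Level] column shows the number of consecutive weeks up to and including the current record. So if you need this to work for 12 weeks, replace "5" with "12" in the final "WHERE" clause.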

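Edit:

Since you only need the distinct users that have at least one run of consecutive weeks, we can reduce the number of columns like this: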

;WITH DataSource ([UserId], [OfficeId], [Start], [Level]) AS
(
    SELECT AnchorMember.[UserId]
          ,AnchorMember.[OfficeId]
          ,DATEADD(DAY, -(DATEPART(WEEKDAY, AnchorMember.[Start])-1), AnchorMember.[Start]) 
          ,1 AS [Level]
    FROM @visits AS AnchorMember
    UNION ALL
    SELECT RecursiveMember.[UserId]
          ,RecursiveMember.[OfficeId]
          ,DATEADD(DAY, -(DATEPART(WEEKDAY, RecursiveMember.[Start])-1), RecursiveMember.[Start]) 
          ,DS.[Level] + 1
    FROM @visits AS RecursiveMember
    INNER JOIN DataSource DS
        ON RecursiveMember.[UserId] = DS.[UserId]
        AND RecursiveMember.[OfficeId] = DS.[OfficeId]
    WHERE DATEADD(DAY, -(DATEPART(WEEKDAY, RecursiveMember.[Start])-1), RecursiveMember.[Start]) = DATEADD(DAY, 7, DS.[Start]) 
)
SELECT DISTINCT [UserId]
               ,[OfficeId]
FROM DataSource
WHERE [Level] = 5
ORDER BY [UserId]
        ,[OfficeId]

Since I do not have access to your data, I cannot be sure what is causing the delay, so this may not help.

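If the query performs well before you add the GROUP BY clause, you could try inserting the CTE results into a temporary table or table variable and then grouping that.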
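
The suggestion above could look roughly like the sketch below. It is not tuned against your real data: the temporary table name `#ConsecutiveWeeks` and the `LongestRun` alias are made up for illustration, the grouping is only a guess at what you need, and the sample rows are user 1's rows from the question.

```sql
-- Sample rows taken from the question (user 1 only, to keep the sketch short)
DECLARE @visits TABLE (UserId int, OfficeId int, Start datetime, [End] datetime);

INSERT INTO @visits (UserId, OfficeId, Start, [End])
VALUES (1, 1, '2013-07-11', '2013-07-13'),
       (1, 1, '2013-07-02', '2013-07-03'),
       (1, 1, '2013-06-26', '2013-06-28'),
       (1, 2, '2013-06-19', '2013-06-19'),
       (1, 1, '2013-06-17', '2013-06-17'),
       (1, 1, '2013-06-13', '2013-06-13');

-- Step 1: run the recursive CTE once and store its rows.
-- #ConsecutiveWeeks is a hypothetical name.
;WITH DataSource ([UserId], [OfficeId], [Start], [Level]) AS
(
    SELECT V.[UserId], V.[OfficeId],
           DATEADD(DAY, -(DATEPART(WEEKDAY, V.[Start]) - 1), V.[Start]),
           1
    FROM @visits AS V
    UNION ALL
    SELECT V.[UserId], V.[OfficeId],
           DATEADD(DAY, -(DATEPART(WEEKDAY, V.[Start]) - 1), V.[Start]),
           DS.[Level] + 1
    FROM @visits AS V
    INNER JOIN DataSource AS DS
        ON  V.[UserId]   = DS.[UserId]
        AND V.[OfficeId] = DS.[OfficeId]
    WHERE DATEADD(DAY, -(DATEPART(WEEKDAY, V.[Start]) - 1), V.[Start]) = DATEADD(DAY, 7, DS.[Start])
)
SELECT [UserId], [OfficeId], [Start], [Level]
INTO #ConsecutiveWeeks
FROM DataSource;

-- Step 2: group the stored rows instead of grouping inside the CTE query
SELECT [UserId], [OfficeId], MAX([Level]) AS LongestRun
FROM #ConsecutiveWeeks
GROUP BY [UserId], [OfficeId]
HAVING MAX([Level]) >= 5;

DROP TABLE #ConsecutiveWeeks;
```

Splitting the work this way means the recursion runs only once, and the optimizer plans the grouping against a plain table. Whether that actually helps depends on your data.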

If this does not solve the performance problem, you could post the query execution plan.

answered 2013-07-12T11:23:43.593

I solved this with a temporary table. I filled it with the Monday and Sunday dates of the last 5 weeks (counting back from the date '2013-07-16').

Then I join on the dates and count the number of records for a given user-office combination.

Note that I had to adjust your Monday and Sunday formulas, probably because of different regional settings. Please adjust them to your needs.
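
To illustrate why the formulas may need adjusting: `DATEPART(dw, ...)` depends on the session's `DATEFIRST` setting, which follows the login language. The sketch below shows the difference and one common way around it, assuming a Monday-to-Sunday week. It relies on 1900-01-01 being a Monday; the variable name `@date` simply mirrors the one used further down.

```sql
DECLARE @date datetime = '20130716';  -- a Tuesday

-- DATEPART(dw, ...) depends on the session's DATEFIRST setting
SET DATEFIRST 7;  -- week starts on Sunday (the us_english default)
SELECT DATEPART(dw, @date) AS DwSundayFirst;   -- 3

SET DATEFIRST 1;  -- week starts on Monday
SELECT DATEPART(dw, @date) AS DwMondayFirst;   -- 2

-- Setting-independent alternative: 1900-01-01 was a Monday, so rounding the
-- number of days since then down to a multiple of 7 gives the Monday of the
-- week containing @date (valid for dates on or after 1900-01-01).
SELECT DATEADD(DAY, DATEDIFF(DAY, '19000101', @date) / 7 * 7,     '19000101') AS Monday,  -- 2013-07-15
       DATEADD(DAY, DATEDIFF(DAY, '19000101', @date) / 7 * 7 + 6, '19000101') AS Sunday;  -- 2013-07-21
```

Keep in mind that `SET DATEFIRST` stays in effect for the rest of the session, so reset it afterwards if other queries rely on the previous value.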

Also note that I used a regular table for Visits instead of a table variable, but that should not make any difference.

create table #Weeks (
  Monday Date,
  Sunday Date
)
GO

DECLARE @date DATE = CONVERT(DATE, '2013-07-16')

INSERT INTO #Weeks(Monday, Sunday)
VALUES 
  (dateadd(DAY, -datepart(dw, DATEADD(WEEK, -1, @date)) + 2, DATEADD(WEEK, -1, @date)), 
   dateadd(DAY, -datepart(dw, DATEADD(WEEK, -1, @date)) + 8, DATEADD(WEEK, -1, @date))),
  (dateadd(DAY, -datepart(dw, DATEADD(WEEK, -2, @date)) + 2, DATEADD(WEEK, -2, @date)), 
   dateadd(DAY, -datepart(dw, DATEADD(WEEK, -2, @date)) + 8, DATEADD(WEEK, -2, @date))),
  (dateadd(DAY, -datepart(dw, DATEADD(WEEK, -3, @date)) + 2, DATEADD(WEEK, -3, @date)), 
   dateadd(DAY, -datepart(dw, DATEADD(WEEK, -3, @date)) + 8, DATEADD(WEEK, -3, @date))),
  (dateadd(DAY, -datepart(dw, DATEADD(WEEK, -4, @date)) + 2, DATEADD(WEEK, -4, @date)), 
   dateadd(DAY, -datepart(dw, DATEADD(WEEK, -4, @date)) + 8, DATEADD(WEEK, -4, @date))),
  (dateadd(DAY, -datepart(dw, DATEADD(WEEK, -5, @date)) + 2, DATEADD(WEEK, -5, @date)), 
   dateadd(DAY, -datepart(dw, DATEADD(WEEK, -5, @date)) + 8, DATEADD(WEEK, -5, @date)))
GO

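-- COUNT(DISTINCT Monday) counts each week once, even with several visits in that week
SELECT UserId, OfficeId, COUNT(DISTINCT Monday) AS WeeksAttended
FROM Visits JOIN #Weeks ON Start <= Sunday AND [End] >= Monday
GROUP BY UserId, OfficeId
HAVING COUNT(DISTINCT Monday) = 5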
GO

DROP TABLE #Weeks

This returns:

USERID  OFFICEID    WEEKSATTENDED
1           1           5
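
Since the real requirement is 12 weeks, typing out every row of the week table by hand gets tedious. Here is a rough sketch of generating the Monday/Sunday pairs instead. The names `@weeks`, `@thisMonday` and `Numbers` are made up for illustration, and the Monday calculation assumes a Monday-to-Sunday week and relies on 1900-01-01 being a Monday, so it does not use `DATEPART(dw, ...)`.

```sql
DECLARE @date  datetime = '20130716';  -- reference date, as in the script above
DECLARE @weeks int      = 12;          -- hypothetical parameter: number of weeks to check

-- Monday of the week containing @date; 1900-01-01 was a Monday
DECLARE @thisMonday datetime =
    DATEADD(DAY, DATEDIFF(DAY, '19000101', @date) / 7 * 7, '19000101');

-- Numbers 1..@weeks from a small recursive CTE
-- (well within the default MAXRECURSION limit of 100)
;WITH Numbers (n) AS
(
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM Numbers WHERE n < @weeks
)
SELECT DATEADD(WEEK, -n, @thisMonday)                  AS Monday,
       DATEADD(DAY, 6, DATEADD(WEEK, -n, @thisMonday)) AS Sunday
FROM Numbers
ORDER BY Monday;
```

The result could be inserted into `#Weeks` in place of the hand-written VALUES list, with the HAVING clause comparing the count against `@weeks` rather than the literal 5.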
answered 2013-07-12T11:23:30.187