sql-server - Can I do this in a SQL function without a cursor?

Question

I'm working on a timesheet database. Put simply, the TimesheetEntries table has four columns

ID int (identity, 1, 1)
StaffID int
ClockedIn datetime
ClockedOut datetime

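I've been asked to write a report that shows staff attendance by date range. The user enters a date, and the report outputs the clock-in and clock-out times of every staff member who attended, along with how long they were on site.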

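However, and this is where it gets tricky, staff sometimes clock out of the site for short periods, and the report needs to ignore these absences (when they are off site for less than 2 hours).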

So, let's assume the following entries

ID  StaffID  ClockedIn    ClockedOut
1   4        0900         1200
2   4        1330         1730
3   5        0900         1200
4   5        1409         1730
5   4        1830         1930

The output of the report should be

StaffID  ClockedIn    ClockedOut
4        0900         1930
5        0900         1200     
5        1409         1730

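Is there any way to do this without a cursor, or even a cursor nested inside a cursor (which is where I am right now!)? We're not talking about huge datasets here, and performance isn't really an issue (it's a report, not a production system), but I really don't like cursors if I can avoid them.

Thanks

Edward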

score 2 · Accepted Answer

I'm sure there's a simpler way to do this, but I was able to get it done with a couple of CTEs:

declare @TimeSheetEntries table
    (
    ID int identity not null primary key,
    StaffID int not null,
    ClockedIn datetime not null,
    ClockedOut datetime not null
    );

insert into @TimeSheetEntries
    (
    StaffID,
    ClockedIn,
    ClockedOut
    )
select
    4,
    '2012-01-01 09:00:00',
    '2012-01-01 12:00:00'
union all select
    4,
    '2012-01-01 13:30:00',
    '2012-01-01 17:30:00'
union all select
    5,
    '2012-01-01 09:00:00',
    '2012-01-01 12:00:00'
union all select
    5,
    '2012-01-01 14:09:00',
    '2012-01-01 17:30:00'
union all select 
    4, 
    '2012-01-01 18:30:00', 
    '2012-01-01 19:30:00'       
;
with MultiCheckins as (
    select distinct
        StaffID,
        cast(cast(cast(ClockedIn as float) as int) as datetime) as TimeSheetDate,
        rank() over (
            partition by StaffID, 
            cast(cast(cast(ClockedIn as float) as int) as datetime)
            order by ClockedIn
            ) as ordinal,
        ClockedIn,
        ClockedOut
    from @TimeSheetEntries
), Organized as
(
select
    row_number() over (
        order by
            mc.StaffID,
            mc.TimeSheetDate,
            mc.ClockedIn,
            mc.ClockedOut
            ) as RowID,
    mc.StaffID,
    mc.TimeSheetDate,
    case
        -- datediff(hour, ...) counts hour boundaries crossed, so compare in minutes instead
        when datediff(minute, coalesce(mc3.ClockedOut, mc.ClockedIn), mc.ClockedIn) >= 120
            then mc.ClockedIn 
        else coalesce(mc3.ClockedIn, mc.ClockedIn)
        end as ClockedIn,
    case 
        when datediff(minute, mc.ClockedOut, coalesce(mc2.ClockedIn, mc.ClockedOut)) < 120
            then coalesce(mc2.ClockedOut, mc.ClockedOut)
        else mc.ClockedOut
        end as ClockedOut
from
    MultiCheckins as mc
left outer join
    MultiCheckIns as mc3
        on mc3.StaffID = mc.StaffID
        and mc3.TimeSheetDate = mc.TimeSheetDate
        and mc3.ordinal =  mc.ordinal - 1
left outer join 
    MultiCheckIns as mc2
        on mc2.StaffID = mc.StaffID
        and mc2.TimeSheetDate = mc.TimeSheetDate
        and mc2.ordinal = mc.ordinal + 1
)
select distinct
    o.StaffID,
    o.ClockedIn,
    o.ClockedOut
from Organized as o
where
    not exists (
        select null from Organized as o2
        where o2.RowID <> o.RowID
        and o2.StaffID = o.StaffID
        and 
            (
            o.ClockedIn between o2.ClockedIn and o2.ClockedOut
            and o.ClockedOut between o2.ClockedIn and o2.ClockedOut
            )
        )

score 1 · Accepted Answer

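I used the data from Jeremy's reply above, but went about the problem in a completely different way. This uses a recursive CTE, which I believe requires SQL Server 2005 or later. It reports the results accurately (I believe), and it also reports the number of clock-ins recorded during the time frame and the total minutes off (which can exceed 120 minutes, because the restriction is only that each off-site period be less than two hours).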

declare @TimeSheetEntries table 
    ( 
    ID int identity not null primary key, 
    StaffID int not null, 
    ClockedIn datetime not null, 
    ClockedOut datetime not null 
    ); 

insert into @TimeSheetEntries 
    ( 
    StaffID, 
    ClockedIn, 
    ClockedOut 
    ) 
select 
    4, 
    '2012-01-01 09:00:00', 
    '2012-01-01 12:00:00' 
union all select 
    4, 
    '2012-01-01 13:30:00', 
    '2012-01-01 17:30:00' 
union all select 
    5, 
    '2012-01-01 09:00:00', 
    '2012-01-01 12:00:00' 
union all select 
    5, 
    '2012-01-01 14:09:00', 
    '2012-01-01 17:30:00'
union all select
    4,
    '2012-01-01 18:30:00', 
    '2012-01-01 19:30:00';


WITH ClockData AS
(
    SELECT ID, StaffID, ClockedIn, ClockedOut AS EffectiveClockout, 1 AS NumClockIns, 0 AS MinutesOff
    FROM @TimeSheetEntries ts
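    -- Anchor rows: entries with no clock-out by the same staff member in the two hours before this clock-in
    WHERE NOT EXISTS (SELECT ID FROM @TimeSheetEntries tsWhere WHERE tsWhere.StaffID = ts.StaffID AND tsWhere.ClockedOut BETWEEN DATEADD(hour, -2, ts.ClockedIn) AND ts.ClockedIn)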

    UNION ALL

    SELECT cd.ID, cd.StaffID, cd.ClockedIn, ts.ClockedOut AS EffectiveClockout, cd.NumClockIns + 1 AS NumClockIns, cd.MinutesOff + DateDiff(minute, cd.EffectiveClockout, ts.ClockedIn) AS MinutesOff
    FROM @TimeSheetEntries ts
    INNER JOIN ClockData cd
        ON ts.StaffID = cd.StaffID
            AND ts.ClockedIn BETWEEN cd.EffectiveClockout AND dateadd(hour, 2, cd.EffectiveClockout)
)
SELECT *
FROM ClockData cd
WHERE NumClockIns = (SELECT MAX(NumClockIns) FROM ClockData WHERE ID = cd.ID)

This returns:

ID   StaffID   ClockedIn                 EffectiveClockout        NumClockIns   MinutesOff
3    5         2012-01-01 09:00:00.000   2012-01-01 12:00:00.000  1             0
4    5         2012-01-01 14:09:00.000   2012-01-01 17:30:00.000  1             0
1    4         2012-01-01 09:00:00.000   2012-01-01 19:30:00.000  3             150

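Update

In case it isn't clear, MinutesOff is only the "allowance" time, that is, the amount of time "eaten" between the ClockedIn and EffectiveClockout shown on the same row. So StaffID 5 took 129 minutes off between clocked periods, but none of it counted as allowance time, so MinutesOff is 0 for both rows.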

score 0 · Accepted Answer

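Option 1: Maybe insert it into a temp table and then use a left join to build the result table (this will work if they can only clock in twice a day, but it won't if you have 3 results)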

-- pair each entry with the later entries for the same staff member
select *
from timesheet ts
left join timesheet tss on tss.StaffID = ts.StaffID and tss.id > ts.id

After this, you can take the min and max, or even build a more robust report.

Option 2:

create table #TimeTable (UserID int, InTime datetime, OutTime datetime)

insert into #TimeTable (UserID) select distinct StaffID from TimesheetEntries

update s set InTime = (select min(ClockedIn) from TimesheetEntries where StaffID = s.UserID) from #TimeTable s

update s set OutTime = (select max(ClockedOut) from TimesheetEntries where StaffID = s.UserID) from #TimeTable s

Given more time, I would combine these into two quick queries, but three will do if you aren't worried about performance.
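As a rough illustration of that consolidation, the three statements in Option 2 could probably be collapsed into a single GROUP BY. This is only a sketch: the table variable @TimesheetEntries and the output column names InTime and OutTime are assumptions made for the example, and the sample rows are the ones from the question.

```sql
-- Sample data from the question (the table variable is an assumption for this sketch).
declare @TimesheetEntries table
    (
    ID int identity not null primary key,
    StaffID int not null,
    ClockedIn datetime not null,
    ClockedOut datetime not null
    );

insert into @TimesheetEntries (StaffID, ClockedIn, ClockedOut)
select 4, '2012-01-01 09:00:00', '2012-01-01 12:00:00'
union all select 4, '2012-01-01 13:30:00', '2012-01-01 17:30:00'
union all select 5, '2012-01-01 09:00:00', '2012-01-01 12:00:00'
union all select 5, '2012-01-01 14:09:00', '2012-01-01 17:30:00'
union all select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00';

-- One row per staff member: earliest clock-in and latest clock-out.
select
    StaffID,
    min(ClockedIn) as InTime,
    max(ClockedOut) as OutTime
from @TimesheetEntries
group by StaffID;
```

Note that, like Option 2 itself, this takes only the overall minimum and maximum, so it merges StaffID 5's two periods into a single 09:00 to 17:30 row and does not apply the two-hour rule from the question.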

score 0 · Accepted Answer

An iterative, set-based approach:

-- Sample data.
declare @TimesheetEntries as Table ( Id Int Identity, StaffId Int, ClockIn DateTime, ClockOut DateTime )
insert into @TimesheetEntries ( StaffId, ClockIn, ClockOut ) values
  ( 4, '2012-05-03 09:00', '2012-05-03 12:00' ),
  ( 4, '2012-05-03 13:30', '2012-05-03 17:30' ), -- This falls within 2 hours of the next two rows.
  ( 4, '2012-05-03 17:35', '2012-05-03 18:00' ),
  ( 4, '2012-05-03 19:00', '2012-05-03 19:30' ),
  ( 4, '2012-05-03 19:45', '2012-05-03 20:00' ),
  ( 5, '2012-05-03 09:00', '2012-05-03 12:00' ),
  ( 5, '2012-05-03 14:09', '2012-05-03 17:30' ),
  ( 6, '2012-05-03 09:00', '2012-05-03 12:00' ),
  ( 6, '2012-05-03 13:00', '2012-05-03 17:00' )
select Id, StaffId, ClockIn, ClockOut from @TimesheetEntries

-- Find all of the periods that need to be coalesced and start the process.
declare @Bar as Table ( Id Int Identity, StaffId Int, ClockIn DateTime, ClockOut DateTime )
insert into @Bar
  select TSl.StaffId, TSl.ClockIn, TSr.ClockOut
    from @TimesheetEntries as TSl inner join
      -- The same staff member and the end of the left period is within two hours of the start of the right period.
      @TimesheetEntries as TSr on TSr.StaffId = TSl.StaffId and DateDiff( ss, TSl.ClockOut, TSr.ClockIn ) between 0 and 7200

-- Continue coalescing periods until we run out of work.
declare @Changed as Bit = 1
while @Changed = 1
  begin
  set @Changed = 0
  -- Coalesce periods.
  update Bl
    -- Take the later ClockOut time from the two rows.
    set ClockOut = case when Br.ClockOut >= Bl.ClockOut then Br.ClockOut else Bl.ClockOut end
    from @Bar as Bl inner join
      @Bar as Br on Br.StaffId = Bl.StaffId and
        -- The left row started before the right and either the right period is completely contained in the left or the right period starts within two hours of the end of the left.
        Bl.ClockIn < Br.ClockIn and ( Br.ClockOut <= Bl.ClockOut or DateDiff( ss, Bl.ClockOut, Br.ClockIn ) < 7200 )
  if @@RowCount > 0
    set @Changed = 1
  -- Delete rows where one period is completely contained in another.
  delete Br
    from @Bar as Bl inner join
      @Bar as Br on Br.StaffId = Bl.StaffId and
        ( ( Bl.ClockIn < Br.ClockIn and Br.ClockOut <= Bl.ClockOut ) or ( Bl.ClockIn <= Br.ClockIn and Br.ClockOut < Bl.ClockOut ) )
  if @@RowCount > 0
    set @Changed = 1
  end

-- Return all of the coalesced periods ...
select StaffId, ClockIn, ClockOut, 'Coalesced Periods' as [Type]
  from @Bar
union all
-- ... and all of the independent periods.
select StaffId, ClockIn, ClockOut, 'Independent Period'
  from @TimesheetEntries as TS
  where not exists ( select 42 from @Bar where StaffId = TS.StaffId and ClockIn <= TS.ClockIn and TS.ClockOut <= ClockOut )
order by ClockIn, StaffId

I'm sure there are some optimizations that should be made.

score 0 · Accepted Answer

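I think you can do this quite easily with just a left join back onto itself and a one-off match. The following isn't a full implementation, more of a proof of concept: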

create table #TimeSheetEntries 
    ( 
    ID int identity not null primary key, 
    StaffID int not null, 
    ClockedIn datetime not null, 
    ClockedOut datetime not null 
    ); 

insert into #TimeSheetEntries 
    ( 
    StaffID, 
    ClockedIn, 
    ClockedOut 
    ) 
select 
    4, 
    '2012-01-01 09:00:00', 
    '2012-01-01 12:00:00' 
union all select 
    4, 
    '2012-01-01 13:30:00', 
    '2012-01-01 17:30:00' 
union all select 
    5, 
    '2012-01-01 09:00:00', 
    '2012-01-01 12:00:00' 
union all select 
    5, 
    '2012-01-01 14:09:00', 
    '2012-01-01 17:30:00'
union all select
    4,
    '2012-01-01 18:30:00', 
    '2012-01-01 19:30:00'
union all select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00';


select * from #timesheetentries tse1
left outer join #timesheetentries tse2 on tse1.staffid = tse2.staffid 
  and tse2.id = 
  (
      select MAX(ID) 
      from #timesheetentries ts_max 
      where ts_max.id < tse1.id and tse1.staffid = ts_max.staffid
  )
  outer apply   
  (
  select DATEDIFF(minute, tse2.clockedout, tse1.clockedin) as BreakTime
  ) as breakCheck

where BreakTime > 120 or BreakTime < 0 or tse2.id is null

order by tse1.StaffID, tse1.ClockedIn


   GO
   drop table #timesheetentries
   GO

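The idea here is that you have your original timesheet table, tse1, and then you do a left join onto the same timesheet table aliased as tse2, matching rows where the StaffID is the same and tse2.ID is the highest ID value that is still less than tse1.ID. This is obviously poor form - you would probably want to use ROW_NUMBER() for this ID comparison, partitioned by StaffID and ordered by your ClockedIn/ClockedOut values, since times may not have been entered in chronological order.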

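At this point, a single row from the joined tables contains the time data from the current timesheet entry as well as the previous one. This means we can compare the ClockedIn/ClockedOut values of consecutive time entries... and using DATEDIFF(), we can work out how long the user was away between the previous ClockedOut and the most recent ClockedIn value. I used an OUTER APPLY for this just because it makes the code cleaner, but you could probably pack it into a subquery.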
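The ROW_NUMBER() variant mentioned above might look something like the following. It is a sketch rather than a tested implementation: the CTE name Ordered, the column rn, and the aliases cur and prev are made up for the example, and a table variable stands in for the temp table so the block runs on its own.

```sql
-- Sample data from the question (table variable used so the sketch is self-contained).
declare @TimeSheetEntries table
    (
    ID int identity not null primary key,
    StaffID int not null,
    ClockedIn datetime not null,
    ClockedOut datetime not null
    );

insert into @TimeSheetEntries (StaffID, ClockedIn, ClockedOut)
select 4, '2012-01-01 09:00:00', '2012-01-01 12:00:00'
union all select 4, '2012-01-01 13:30:00', '2012-01-01 17:30:00'
union all select 5, '2012-01-01 09:00:00', '2012-01-01 12:00:00'
union all select 5, '2012-01-01 14:09:00', '2012-01-01 17:30:00'
union all select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00';

-- Number each staff member's entries in chronological order,
-- so the "previous entry" no longer depends on the identity column.
with Ordered as
(
    select
        StaffID,
        ClockedIn,
        ClockedOut,
        row_number() over (partition by StaffID order by ClockedIn, ClockedOut) as rn
    from @TimeSheetEntries
)
select
    cur.StaffID,
    cur.ClockedIn,
    cur.ClockedOut,
    prev.ClockedOut as PreviousClockedOut,
    -- NULL for each staff member's first entry, because there is no previous row
    datediff(minute, prev.ClockedOut, cur.ClockedIn) as BreakTime
from Ordered as cur
left outer join Ordered as prev
    on prev.StaffID = cur.StaffID
    and prev.rn = cur.rn - 1
order by cur.StaffID, cur.ClockedIn;
```

Ordering by ClockedIn instead of ID means rows entered out of sequence are still paired with their true predecessor.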

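Once we've done the DATEDIFF(), it's easy to find the cases where an individual BreakTime doesn't exceed the 120-minute barrier and remove those timesheet entries, leaving only the significant rows of each staff member's timesheet for use in your later reports.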
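To go from this proof of concept to the merged rows the question asks for, one possible follow-up is a "gaps and islands" query. This is a different technique from the join above, not part of the original answer: it assumes SQL Server 2012 or later (for LAG and a windowed running SUM) and assumes a staff member's entries do not overlap. The names Flagged, Grouped, StartsNewPeriod and PeriodNo are invented for the sketch.

```sql
-- Sample data from the question (table variable used so the sketch is self-contained).
declare @TimeSheetEntries table
    (
    ID int identity not null primary key,
    StaffID int not null,
    ClockedIn datetime not null,
    ClockedOut datetime not null
    );

insert into @TimeSheetEntries (StaffID, ClockedIn, ClockedOut)
select 4, '2012-01-01 09:00:00', '2012-01-01 12:00:00'
union all select 4, '2012-01-01 13:30:00', '2012-01-01 17:30:00'
union all select 5, '2012-01-01 09:00:00', '2012-01-01 12:00:00'
union all select 5, '2012-01-01 14:09:00', '2012-01-01 17:30:00'
union all select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00';

with Flagged as
(
    -- An entry starts a new period unless the break before it is under 120 minutes.
    -- The first entry per staff member has no previous row, so the comparison is
    -- unknown and the CASE falls through to 1.
    select
        StaffID,
        ClockedIn,
        ClockedOut,
        case
            when datediff(minute,
                     lag(ClockedOut) over (partition by StaffID order by ClockedIn),
                     ClockedIn) < 120
                then 0
            else 1
        end as StartsNewPeriod
    from @TimeSheetEntries
), Grouped as
(
    -- A running total of the flags gives every entry in the same period the same number.
    select
        StaffID,
        ClockedIn,
        ClockedOut,
        sum(StartsNewPeriod) over (
            partition by StaffID
            order by ClockedIn
            rows between unbounded preceding and current row
            ) as PeriodNo
    from Flagged
)
select
    StaffID,
    min(ClockedIn) as ClockedIn,
    max(ClockedOut) as ClockedOut
from Grouped
group by StaffID, PeriodNo
order by StaffID, min(ClockedIn);
```

With the sample rows, StaffID 4's breaks of 90 and 60 minutes are absorbed into one 09:00 to 19:30 period, while StaffID 5's 129-minute break keeps the two periods separate, which matches the output requested in the question.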
