3

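I have the following two tables, Diagnose and Exercise. For each diagnosis I want to extract the exercise date closest to the diagnosis date; the result should be exactly one row from the Exercise table per diagnosis.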

I have tried a LEFT JOIN with the DATEDIFF function in the WHERE clause:

SELECT D.ID,D.Diagnose_Date,D.Type1,D.SubType1,E.Exercise_Date,E.Field1,E.Field2,E.Field3
FROM Diagnose D
LEFT JOIN Exercise E
ON D.ID=E.ID
WHERE DATEDIFF(DAY,[Diagnose_Date],[Exercise_Date]) BETWEEN -30 AND 30

Any help would be greatly appreciated.

Thanks in advance.


Diagnose table

------------------------------------------
ID     Diagnose_Date    Type1    SubType1    
------------------------------------------
1      10/01/2010       01       1.1
2      20/02/2012       02       2.2
3      30/03/2013       01       1.2
------------------------------------------

Exercise table

------------------------------------------
ID     Exercise_Date  Field1  Field2  Field3
------------------------------------------
1      01/01/2010        x       y      z
2      10/02/2012        a       b      c
2      01/04/2012        e       f      f
3      01/03/2013        x       y      z
3      05/04/2013        a       b      c
3      01/06/2013        x       y      z
------------------------------------------

The expected result should be:

------------------------------------------------------------------------
ID  Diagnose_Date  Exercise_Date Type1 SubType1  Field1  Field2  Field3
------------------------------------------------------------------------
1   10/01/2010     01/01/2010     01    1.1         x       y        z
2   20/02/2012     10/02/2012     02    2.2         a       b        c
3   30/03/2013     05/04/2013     01    1.2         a       b        c
-------------------------------------------------------------------------
4 Answers

2

First, in a CTE, find for each diagnosis the minimum gap in days between the diagnosis date and all exercise dates related to that diagnosis.

WITH MIN_DATES_CTE(ID, DATE_DIFF)
AS (
    SELECT E.ID, MIN(ABS(DATEDIFF(DAY,[Diagnose_Date],[Exercise_Date])))
    FROM Exercise E
    INNER JOIN Diagnose D ON D.ID = E.ID
    GROUP BY E.ID
)

Then join Diagnose and Exercise on ID and on that minimum gap:

SELECT D.ID,D.Diagnose_Date,D.Type1,D.SubType1,E.Exercise_Date,E.Field1,E.Field2,E.Field3
FROM Diagnose D
LEFT JOIN Exercise E ON D.ID = E.ID
INNER JOIN MIN_DATES_CTE ON MIN_DATES_CTE.ID = E.ID
WHERE ABS(DATEDIFF(DAY,[Diagnose_Date],[Exercise_Date])) = MIN_DATES_CTE.DATE_DIFF
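A CTE has to be followed directly by the statement that uses it, so the two fragments above are meant to run as one statement. A minimal sketch of the combined query, assuming the column names shown in the question's tables (Diagnose_Date, SubType1); the alias M is introduced here only for brevity:

```sql
-- Sketch: CTE and final SELECT combined into a single statement.
-- Column names (Diagnose_Date, SubType1) are assumed from the question's tables.
WITH MIN_DATES_CTE (ID, DATE_DIFF) AS (
    -- Smallest absolute gap, in days, for each diagnosis ID
    SELECT E.ID, MIN(ABS(DATEDIFF(DAY, D.Diagnose_Date, E.Exercise_Date)))
    FROM Exercise E
    INNER JOIN Diagnose D ON D.ID = E.ID
    GROUP BY E.ID
)
SELECT D.ID, D.Diagnose_Date, D.Type1, D.SubType1,
       E.Exercise_Date, E.Field1, E.Field2, E.Field3
FROM Diagnose D
INNER JOIN Exercise E ON E.ID = D.ID
INNER JOIN MIN_DATES_CTE M ON M.ID = D.ID
-- Keep only the exercise rows whose gap equals the minimum for that ID
WHERE ABS(DATEDIFF(DAY, D.Diagnose_Date, E.Exercise_Date)) = M.DATE_DIFF;
```

If two exercise dates are equally close to a diagnosis (for example 5 days before and 5 days after), this approach returns both rows for that ID.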
Answered 2013-11-07T14:02:35.737
1

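I am assuming you simply want to match any single diagnosis entry with any single exercise entry, based on which dates are closest to each other.

Here is my line of thought:
Do a full join (a cross join) of Diagnosis and Exercise, ordered by the absolute date difference, ascending.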

SELECT
    D.ID,
    D.Date,
    E.ID,
    E.Date,
    ABS(DATEDIFF(day, D.Date, E.Date)) Diff

FROM Diagnosis D, Exercise E
ORDER BY Diff

You will get a result like this:

ID  Date        ID  Date        Diff
3   2013-03-30  5   2013-03-25  5
2   2012-02-20  2   2012-02-10  10
3   2013-03-30  4   2013-03-01  29
2   2012-02-20  3   2012-04-01  41
3   2013-03-30  6   2013-06-01  63
1   2010-10-01  1   2010-01-01  273
3   2013-03-30  3   2012-04-01  363
2   2012-02-20  4   2013-03-01  375
2   2012-02-20  5   2013-03-25  399
3   2013-03-30  2   2012-02-10  414
2   2012-02-20  6   2013-06-01  467
1   2010-10-01  2   2012-02-10  497
1   2010-10-01  3   2012-04-01  548
2   2012-02-20  1   2010-01-01  780
1   2010-10-01  4   2013-03-01  882
1   2010-10-01  5   2013-03-25  906
1   2010-10-01  6   2013-06-01  974
3   2013-03-30  1   2010-01-01  1184

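Now you can see which dates are closest to each other and how many days apart they are.

Of course you would not use this as is, but from this list you can pick the first row: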

SELECT TOP 1
    D.ID,
    D.Date,
    E.ID,
    E.Date,
    ABS(DATEDIFF(day, D.Date, E.Date)) Diff

FROM Diagnosis D, Exercise E
ORDER BY Diff

Now you can plug this statement into a LEFT JOIN, so that each diagnosis is matched individually with its closest exercise date.
Like this:

SELECT
    fD.ID,
    fD.Date,
    fE.ID,
    fE.Date
FROM
    Diagnosis fD
    LEFT JOIN Exercise fE
        ON fE.ID = (SELECT TOP 1 E.ID
                        FROM Diagnosis D, Exercise E
                        WHERE D.ID = fD.ID
                        ORDER BY ABS(DATEDIFF(day, D.Date, E.Date)))

This gives the result:

ID  Date        ID  Date
1   2010-10-01  1   2010-01-01
2   2012-02-20  2   2012-02-10
3   2013-03-30  5   2013-03-25
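The queries above use this answer's own test schema, where Exercise.ID identifies a single exercise row. In the question's tables, Exercise.ID repeats and refers to the diagnosis, so the join would need to match on the closest date instead. A hedged sketch of that adaptation, assuming the question's column names; the alias E2 is introduced here for illustration:

```sql
-- Sketch: the same LEFT JOIN idea adapted to the question's schema,
-- where Exercise.ID is the diagnosis ID and is not unique per row.
SELECT D.ID, D.Diagnose_Date, D.Type1, D.SubType1,
       E.Exercise_Date, E.Field1, E.Field2, E.Field3
FROM Diagnose D
LEFT JOIN Exercise E
    ON  E.ID = D.ID
    -- Join only to the exercise date nearest to this diagnosis
    AND E.Exercise_Date = (SELECT TOP 1 E2.Exercise_Date
                           FROM Exercise E2
                           WHERE E2.ID = D.ID
                           ORDER BY ABS(DATEDIFF(DAY, D.Diagnose_Date, E2.Exercise_Date)));
```

If one diagnosis has several exercise rows on that same nearest date, all of them are returned.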
Answered 2013-11-07T13:53:55.540
1

You can use OUTER APPLY:

SELECT  d.ID, 
        d.Diagnose_Date, 
        d.Type1, 
        d.SubType1, 
        e.Exercise_Date, 
        e.Field1, 
        e.Field2, 
        e.Field3
FROM    Diagnose d
        OUTER APPLY
        (   SELECT  TOP 1 Exercise_Date, Field1, Field2, Field3
            FROM    Exercise e
            WHERE   d.ID = e.ID
            AND     DATEDIFF(DAY, d.[Diagnose_Date], e.[Exercise_Date]) BETWEEN -30 AND 30
            ORDER BY ABS(DATEDIFF(DAY, d.[Diagnose_Date], e.[Exercise_Date])) 
        ) e;

Example on SQL Fiddle
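To try the OUTER APPLY query locally, the question's sample data can be loaded into temp tables. This is only a sketch: the column types are assumptions (the question does not state them), the dates are read as dd/mm/yyyy, and the alias x is introduced here for illustration.

```sql
-- Sketch: the question's sample data in temp tables (dates read as dd/mm/yyyy).
-- Column types are assumptions; the question does not state them.
CREATE TABLE #Diagnose (ID INT, Diagnose_Date DATE, Type1 VARCHAR(2), SubType1 VARCHAR(3));
CREATE TABLE #Exercise (ID INT, Exercise_Date DATE, Field1 CHAR(1), Field2 CHAR(1), Field3 CHAR(1));

INSERT INTO #Diagnose (ID, Diagnose_Date, Type1, SubType1) VALUES
    (1, '20100110', '01', '1.1'),
    (2, '20120220', '02', '2.2'),
    (3, '20130330', '01', '1.2');

INSERT INTO #Exercise (ID, Exercise_Date, Field1, Field2, Field3) VALUES
    (1, '20100101', 'x', 'y', 'z'),
    (2, '20120210', 'a', 'b', 'c'),
    (2, '20120401', 'e', 'f', 'f'),
    (3, '20130301', 'x', 'y', 'z'),
    (3, '20130405', 'a', 'b', 'c'),
    (3, '20130601', 'x', 'y', 'z');

-- For each diagnosis, take the single closest exercise within +/- 30 days
SELECT d.ID, d.Diagnose_Date, e.Exercise_Date, e.Field1, e.Field2, e.Field3
FROM #Diagnose d
OUTER APPLY (
    SELECT TOP 1 x.Exercise_Date, x.Field1, x.Field2, x.Field3
    FROM #Exercise x
    WHERE x.ID = d.ID
      AND DATEDIFF(DAY, d.Diagnose_Date, x.Exercise_Date) BETWEEN -30 AND 30
    ORDER BY ABS(DATEDIFF(DAY, d.Diagnose_Date, x.Exercise_Date))
) e
ORDER BY d.ID;

DROP TABLE #Exercise;
DROP TABLE #Diagnose;
```

With this data the query pairs ID 1 with 2010-01-01, ID 2 with 2012-02-10 and ID 3 with 2013-04-05, which matches the expected result in the question.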

I tested this some more and found that the ROW_NUMBER() approach is the most efficient:

WITH CTE AS
(   SELECT  d.ID,
            d.Diagnose_Date,
            d.Type1,
            d.SubType1, 
            e.Exercise_Date,
            e.Field1,
            e.Field2,
            e.Field3,
            RowNumber = ROW_NUMBER() OVER (PARTITION BY d.ID ORDER BY ABS(DATEDIFF(DAY,[Diagnose_Date],[Exercise_Date])))
    FROM    Diagnose D
            LEFT JOIN Exercise E 
                ON D.ID = E.ID
)
SELECT  ID,
        Diagnose_Date,
        Type1,
        SubType1, 
        EID = ID,
        Exercise_Date,
        Field1,
        Field2,
        Field3
FROM    CTE
WHERE   RowNumber = 1;
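Two details may be worth adjusting in practice. ROW_NUMBER() picks an arbitrary row when two exercise dates are equally close, and the query above does not apply the 30-day window from the question. A hedged variant, assuming that the earlier exercise date should win a tie (that rule is an assumption, not something the question specifies):

```sql
-- Sketch: ROW_NUMBER() with a deterministic tie-break and the +/- 30 day window.
-- Tie-break rule (earlier Exercise_Date wins) is an assumption for illustration.
WITH CTE AS
(   SELECT  d.ID,
            d.Diagnose_Date,
            d.Type1,
            d.SubType1,
            e.Exercise_Date,
            e.Field1,
            e.Field2,
            e.Field3,
            RowNumber = ROW_NUMBER() OVER (
                PARTITION BY d.ID
                ORDER BY ABS(DATEDIFF(DAY, d.Diagnose_Date, e.Exercise_Date)),
                         e.Exercise_Date)
    FROM    Diagnose d
            LEFT JOIN Exercise e
                ON  e.ID = d.ID
                -- Window in the ON clause keeps diagnoses that have no match
                AND DATEDIFF(DAY, d.Diagnose_Date, e.Exercise_Date) BETWEEN -30 AND 30
)
SELECT  ID, Diagnose_Date, Type1, SubType1,
        Exercise_Date, Field1, Field2, Field3
FROM    CTE
WHERE   RowNumber = 1;
```

Putting the window in the ON clause rather than in a WHERE clause means a diagnosis with no exercise in range still appears, with NULLs in the exercise columns.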

I compared this with my first solution and with the most upvoted answer. The results are as follows:

OUTER APPLY

Cost relative to batch: 34%
--------------------------------------------------
Table 'Exercise'. Scan count 3, logical reads 3
Table 'Diagnose'. Scan count 1, logical reads 1
--------------------------------------------------
Total. Scan count 4, logical reads 4

SELF JOIN WITH AGGREGATES (most upvoted answer so far)

Cost relative to batch: 51%
--------------------------------------------------
Table 'Worktable'. Scan count 0, logical reads 0
Table 'Exercise'. Scan count 2, logical reads 4
Table 'Diagnose'. Scan count 2, logical reads 2
--------------------------------------------------
Total. Scan count 4, logical reads 6

ROW_NUMBER()

Cost relative to batch: 15%
--------------------------------------------------
Table 'Exercise'. Scan count 1, logical reads 3
Table 'Diagnose'. Scan count 1, logical reads 1
--------------------------------------------------
Total. Scan count 2, logical reads 4

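Example on SQL Fiddle

So the ROW_NUMBER solution has the fewest scans (2 versus 4), ties with OUTER APPLY for the fewest logical reads (4), and has the lowest estimated cost.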
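For anyone who wants to repeat the comparison, per-table scan counts and logical reads like those above can be collected in SQL Server with SET STATISTICS IO. A minimal sketch, using the OUTER APPLY query as the query under test:

```sql
-- Sketch: print per-table scan counts and logical reads for a query.
SET STATISTICS IO ON;

SELECT  d.ID, d.Diagnose_Date, e.Exercise_Date
FROM    Diagnose d
        OUTER APPLY
        (   SELECT  TOP 1 Exercise_Date
            FROM    Exercise e
            WHERE   d.ID = e.ID
            ORDER BY ABS(DATEDIFF(DAY, d.Diagnose_Date, e.Exercise_Date))
        ) e;

SET STATISTICS IO OFF;
```

The figures appear in the Messages output. The "cost relative to batch" percentages come from the execution plan when several queries are run together in one batch, and they are estimates, so they are best read alongside the IO numbers rather than on their own.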

Answered 2013-11-07T13:55:01.250
0

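Using only standard SQL (note that this picks the most recent exercise on or before the diagnosis date, not the closest one in either direction):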

SELECT D.ID, D.Diagnose_Date, D.Type1, D.SubType1, E.Exercise_Date, E.Field1, E.Field2, E.Field3
FROM Diagnose D
LEFT JOIN Exercise E
ON E.ID=D.ID AND
   E.Exercise_Date=(SELECT MAX(Exercise_Date) FROM Exercise WHERE Exercise.ID=D.ID AND Exercise.Exercise_Date<=D.Diagnose_Date)
Answered 2013-11-07T14:03:42.500