sql - SQL: Last_Value() returns wrong result (but First_Value() works fine)

Question

As shown in the snapshot, I have a table in SQL Server 2012:

[Screenshot of the Emp_Amt table]

I then used Last_Value() and First_Value() to get each EmpID's AverageAmount for different YearMonth values. The script is as follows:

SELECT A.EmpID,  
       First_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey asc) AS  '200901AvgAmount', 
       Last_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey asc) AS '201112AvgAmount'

FROM  Emp_Amt  AS A

However, the result of this query is not what I expected:

The '201112AvgAmount' column shows different values for each EmpID, while '200901AvgAmount' has the correct values.

Is there anything wrong with my SQL script? I have done a lot of research online but still cannot find the answer...

score 21 · Accepted Answer

Here is a quick query to illustrate the behaviour:

select 
  v,

  -- FIRST_VALUE() and LAST_VALUE()
  first_value(v) over(order by v) f1,
  first_value(v) over(order by v rows between unbounded preceding and current row) f2,
  first_value(v) over(order by v rows between unbounded preceding and unbounded following) f3,
  last_value (v) over(order by v) l1,
  last_value (v) over(order by v rows between unbounded preceding and current row) l2,
  last_value (v) over(order by v rows between unbounded preceding and unbounded following) l3,

  -- For completeness' sake, let's also compare the above with MAX()
  max        (v) over() m1,
  max        (v) over(order by v) m2,
  max        (v) over(order by v rows between unbounded preceding and current row) m3,
  max        (v) over(order by v rows between unbounded preceding and unbounded following) m4
from (values(1),(2),(3),(4)) t(v)

The output of the above query can be seen here (SQLFiddle here):

| V | F1 | F2 | F3 | L1 | L2 | L3 | M1 | M2 | M3 | M4 |
|---|----|----|----|----|----|----|----|----|----|----|
| 1 |  1 |  1 |  1 |  1 |  1 |  4 |  4 |  1 |  1 |  4 |
| 2 |  1 |  1 |  1 |  2 |  2 |  4 |  4 |  2 |  2 |  4 |
| 3 |  1 |  1 |  1 |  3 |  3 |  4 |  4 |  3 |  3 |  4 |
| 4 |  1 |  1 |  1 |  4 |  4 |  4 |  4 |  4 |  4 |  4 |

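Few people think of the implicit frame that is applied to window functions that take an ORDER BY clause. In this case, the window defaults to the frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. (RANGE is not exactly the same as ROWS, but that's another story.) Think of it this way: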

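- On the row with v = 1, the ordered window's frame spans v IN (1)
- On the row with v = 2, the ordered window's frame spans v IN (1, 2)
- On the row with v = 3, the ordered window's frame spans v IN (1, 2, 3)
- On the row with v = 4, the ordered window's frame spans v IN (1, 2, 3, 4)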
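
One way to see these growing frames directly is to count the rows each frame contains. The following is a minimal sketch that reuses the same four-row VALUES table as the query above; the column aliases are made up for illustration.

```sql
-- Count how many rows fall inside the frame at each row.
SELECT
  v,
  -- implicit frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  COUNT(*) OVER (ORDER BY v) AS default_frame_rows,
  -- explicit frame covering the whole (unpartitioned) window
  COUNT(*) OVER (ORDER BY v
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS full_frame_rows
FROM (VALUES (1), (2), (3), (4)) t(v)
ORDER BY v;
```

With the default frame the count grows 1, 2, 3, 4 from row to row, while the explicit full frame reports 4 on every row. That is why LAST_VALUE() over the default frame simply returns the current row's value.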

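If you want to prevent this behaviour, you have two options:

- Use an explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING clause for ordered window functions
- Use no ORDER BY clause in those window functions that allow omitting it (as in MAX(v) OVER())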

More details are explained in this article about LEAD(), LAG(), FIRST_VALUE() and LAST_VALUE()
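
As for the aside that RANGE and ROWS are not quite the same: the difference only shows up when the ORDER BY column contains ties. A small sketch, assuming a VALUES table with a duplicated value (the data and aliases are illustrative only):

```sql
-- RANGE treats rows with equal ORDER BY values (peers) as part of the
-- current row's frame; ROWS counts physical rows only.
SELECT
  v,
  COUNT(*) OVER (ORDER BY v RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_frame_rows,
  COUNT(*) OVER (ORDER BY v ROWS  BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_frame_rows
FROM (VALUES (1), (2), (2), (3)) t(v)
ORDER BY v;
```

Here range_frame_rows comes out as 1, 3, 3, 4 because both rows with v = 2 see each other as peers, whereas rows_frame_rows comes out as 1, 2, 3, 4.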

score 15 · Accepted Answer

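There is nothing wrong with your script; this is how ordered windows work in SQL Server :/. If you change LAST_VALUE to MAX, the same default frame applies, so it too is evaluated only up to the current row. The solution is: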

SELECT A.EmpID,  
       First_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey asc) AS  '200901AvgAmount', 
       Last_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS '201112AvgAmount'  
FROM  Emp_Amt  AS A

There is a great post about this (link). GL!
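
To try the fix without the real table, here is a self-contained sketch. The rows in the CTE are hypothetical stand-ins for Emp_Amt (the actual data was only shown in a screenshot), and the DISTINCT is an addition that assumes one output row per EmpID is wanted.

```sql
-- Hypothetical sample rows standing in for the Emp_Amt table from the question.
WITH Emp_Amt (EmpID, DimYearMonthKey, AverageAmount) AS (
  SELECT * FROM (VALUES
    (1, 200901, 100.0),
    (1, 201006,  80.0),
    (1, 201112, 150.0),
    (2, 200901, 200.0),
    (2, 201112, 120.0)
  ) s (EmpID, DimYearMonthKey, AverageAmount)
)
SELECT DISTINCT
       A.EmpID,
       FIRST_VALUE(A.AverageAmount) OVER (PARTITION BY A.EmpID
                                          ORDER BY A.DimYearMonthKey) AS [200901AvgAmount],
       -- explicit frame so LAST_VALUE sees the whole partition
       LAST_VALUE(A.AverageAmount)  OVER (PARTITION BY A.EmpID
                                          ORDER BY A.DimYearMonthKey
                                          ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS [201112AvgAmount]
FROM Emp_Amt AS A
ORDER BY A.EmpID;
```

With this sample data the query returns one row per employee: 100.0 and 150.0 for EmpID 1, and 200.0 and 120.0 for EmpID 2. For EmpID 2 the last value (120.0) is not the maximum (200.0), which shows that LAST_VALUE and MAX are not interchangeable once the frame covers the whole partition.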

score 0 · Accepted Answer

The simplest way is to use first_value twice in the query, ordering asc for the first case and desc for the second.

SELECT A.EmpID,  
       First_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey asc) AS  '200901AvgAmount', 
       First_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey desc) AS '201112AvgAmount'

FROM  Emp_Amt  AS A
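
This works because the default frame always starts at the first row of the partition, so FIRST_VALUE is not affected by where the frame ends. A quick check, using hypothetical sample rows in place of Emp_Amt, that compares it with the explicit-frame LAST_VALUE:

```sql
-- Both expressions should pick the AverageAmount of the latest DimYearMonthKey.
SELECT
  EmpID,
  DimYearMonthKey,
  FIRST_VALUE(AverageAmount) OVER (PARTITION BY EmpID
                                   ORDER BY DimYearMonthKey DESC) AS via_first_value_desc,
  LAST_VALUE(AverageAmount)  OVER (PARTITION BY EmpID
                                   ORDER BY DimYearMonthKey ASC
                                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS via_last_value_full_frame
FROM (VALUES
  (1, 200901, 100.0),
  (1, 201006,  80.0),
  (1, 201112, 150.0)
) AS A (EmpID, DimYearMonthKey, AverageAmount)
ORDER BY EmpID, DimYearMonthKey;
```

Both columns show 150.0 on all three rows. The two approaches agree as long as DimYearMonthKey is unique within each EmpID; if there are ties on the sort key, which tied row counts as "first" or "last" is not deterministic in either form.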
