
I have this table:

create table t (value int, dt date);

 value |     dt     
-------+------------
    10 | 2012-10-30
    15 | 2012-10-29
  null | 2012-10-28
  null | 2012-10-27
     7 | 2012-10-26

I want this output:

 value |     dt     
-------+------------
    10 | 2012-10-30
     5 | 2012-10-29
     5 | 2012-10-28
     5 | 2012-10-27
     7 | 2012-10-26

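With the table ordered by date descending, I want to replace each run of nulls, together with the non-null value just before it, with that non-null value averaged over the whole run. In this example, 15 is the non-null value preceding the next two nulls, so each of the three rows gets 15 / 3 = 5.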


2 Answers


I found a surprisingly simple solution:

SELECT max(value) OVER (PARTITION BY grp)
      / count(*)  OVER (PARTITION BY grp) AS value
      ,dt
FROM   (
   SELECT *, count(value) OVER (ORDER BY dt DESC) AS grp
   FROM   t
   ) a;


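Since count() ignores NULL values, a running count (the default frame when a window function has an ORDER BY) gives you a quick way to group the rows: every non-null value starts a new group, and the nulls that follow it share its number (the grp column).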

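Every group has exactly one non-null value, so min, max, or sum in another window function all return that same value. Divide by the number of members in grp (count(*) this time, so that the NULL rows are counted too) and we are done.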
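To make the two steps visible, here is a minimal sketch that runs the inner and the full query against an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) is my stand-in here, not the asker's database, and the names `groups` and `result` are only for illustration.

```python
import sqlite3

# Assumption: SQLite 3.25+ stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (value int, dt date)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(10, "2012-10-30"), (15, "2012-10-29"), (None, "2012-10-28"),
     (None, "2012-10-27"), (7, "2012-10-26")],
)

# Step 1: the running count of non-null values labels each group.
groups = conn.execute(
    "SELECT dt, value, count(value) OVER (ORDER BY dt DESC) AS grp "
    "FROM t ORDER BY dt DESC"
).fetchall()

# Step 2: spread each group's single non-null value over all of its rows.
result = conn.execute(
    """
    SELECT max(value) OVER (PARTITION BY grp)
         / count(*)   OVER (PARTITION BY grp) AS value, dt
    FROM  (SELECT *, count(value) OVER (ORDER BY dt DESC) AS grp FROM t) a
    ORDER  BY dt DESC
    """
).fetchall()

for row in groups:
    print(row)   # (dt, value, grp)
for row in result:
    print(row)   # (value, dt)
```

Both operands of the division are integers here, so the result is truncated (16 spread over 3 rows would give 5). Cast to numeric first if you need the fraction.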

answered 2012-11-05T19:12:29.593

Taken as a puzzle, here is a solution... in practice it may perform horribly depending on the nature of your data. Either way, watch your indexes:

create database tmp;
use tmp;
create table t (value float, dt date); -- if you use int, you need to care about rounding
insert into t values (10, '2012-10-30'), (15, '2012-10-29'), (null, '2012-10-28'), (null, '2012-10-27'), (7, '2012-10-26');

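select t1.dt, t1.value, t2.dt, t2.value, count(*) cnt 
from t t1, t t2, t t3 
where 
    -- t2: the earliest non-null row on or after t1's date
    t2.dt >= t1.dt and t2.value is not null 
    and not exists (
        select * 
        from t 
        where t.dt < t2.dt and t.dt >= t1.dt and t.value is not null
    ) 
    -- t3: every row in t2's group (t2 plus the nulls just before it)
    and t3.dt <= t2.dt 
    and not exists (
        select * 
        from t 
        where t.dt >= t3.dt and t.dt < t2.dt and t.value is not null
    ) 
-- group by every selected column so this also runs under ONLY_FULL_GROUP_BY
group by t1.dt, t1.value, t2.dt, t2.value
order by t1.dt;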

+------------+-------+------------+-------+-----+
| dt         | value | dt         | value | cnt |
+------------+-------+------------+-------+-----+
| 2012-10-26 |     7 | 2012-10-26 |     7 |   1 |
| 2012-10-27 |  NULL | 2012-10-29 |    15 |   3 |
| 2012-10-28 |  NULL | 2012-10-29 |    15 |   3 |
| 2012-10-29 |    15 | 2012-10-29 |    15 |   3 |
| 2012-10-30 |    10 | 2012-10-30 |    10 |   1 |
+------------+-------+------------+-------+-----+
5 rows in set (0.00 sec)

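select dt, value/cnt 
from (
    select t1.dt, t2.value, count(*) cnt 
    from t t1, t t2, t t3 
    where 
        -- t2: the earliest non-null row on or after t1's date
        t2.dt >= t1.dt and t2.value is not null 
        and not exists (
            select * 
            from t 
            where t.dt < t2.dt and t.dt >= t1.dt and t.value is not null
        ) 
        -- t3: every row in t2's group, counted to get the divisor
        and t3.dt <= t2.dt 
        and not exists (
            select * 
            from t 
            where t.dt >= t3.dt and t.dt < t2.dt and t.value is not null
        ) 
    -- group by every selected column so this also runs under ONLY_FULL_GROUP_BY
    group by t1.dt, t2.value
) x
order by dt;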

+------------+-----------+
| dt         | value/cnt |
+------------+-----------+
| 2012-10-26 |         7 |
| 2012-10-27 |         5 |
| 2012-10-28 |         5 |
| 2012-10-29 |         5 |
| 2012-10-30 |        10 |
+------------+-----------+
5 rows in set (0.00 sec)

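Explanation:

  • t1 is the original table
  • t2 is, for each t1 row, the row with the smallest date on or after t1's date that has a non-null value
  • t3 is every row between t2 and the previous non-null row (t2 included), so we can group by the other columns and count them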
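If the three aliases are still hard to follow, this plain-Python trace of the same conditions may help. It is only a sketch over the sample rows, assuming unique dates. The helper `non_null_between` is a name I made up for the two NOT EXISTS probes, and it is not part of the query.

```python
from datetime import date

# The sample rows from the question: (value, dt).
rows = [
    (10, date(2012, 10, 30)),
    (15, date(2012, 10, 29)),
    (None, date(2012, 10, 28)),
    (None, date(2012, 10, 27)),
    (7, date(2012, 10, 26)),
]

def non_null_between(lo, hi):
    """True if some non-null row has lo <= dt < hi (the NOT EXISTS probes)."""
    return any(v is not None and lo <= d < hi for v, d in rows)

result = {}
for _, d1 in rows:  # t1: every row of the table
    # t2: the earliest non-null row on or after t1's date.
    # With unique dates the filter leaves exactly one candidate.
    v2, d2 = next(
        (v, d) for v, d in rows
        if v is not None and d >= d1 and not non_null_between(d1, d)
    )
    # t3: rows up to t2 with no non-null row in between, i.e. t2's group.
    cnt = sum(1 for _, d3 in rows if d3 <= d2 and not non_null_between(d3, d2))
    result[d1] = v2 / cnt

for d in sorted(result):
    print(d, result[d])
```

Every row scans the table several times, which is the same reason the SQL version can get slow on larger data.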

Sorry I can't be clearer. It's confusing to me too :-)

answered 2012-11-05T18:34:14.670