5

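I have a table containing values and group IDs (simplified example). I need to get the average of the middle 3 values for each group. So if a group has 1, 2, or 3 values, it is just the plain average. But if it has 4 values, the highest is excluded; with 5 values, the highest and the lowest; and so on. I was thinking of some kind of window function, but I'm not sure whether that is possible.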

http://www.sqlfiddle.com/#!11/af5e0/1

For this data:

TEST_ID TEST_VALUE  GROUP_ID
1       5           1
2       10          1
3       15          1
4       25          2
5       35          2
6       5           2
7       15          2
8       25          3
9       45          3
10      55          3
11      15          3
12      5           3
13      25          3
14      45          4

I would like:

GROUP_ID    AVG
1           10
2           15
3           21.6
4           45
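
As a cross-check for any SQL solution, the rule above can be written as a small reference implementation. This is only a sketch in Python, not part of the original question: the names `ROWS`, `middle_three_mean`, and `middle_three_by_group` are hypothetical, and it assumes that when an odd number of values has to be dropped, the extra one comes off the top (which is what "4 values excludes the highest" implies).

```python
from collections import defaultdict

# Sample rows from the question: (test_id, test_value, group_id).
ROWS = [
    (1, 5, 1), (2, 10, 1), (3, 15, 1),
    (4, 25, 2), (5, 35, 2), (6, 5, 2), (7, 15, 2),
    (8, 25, 3), (9, 45, 3), (10, 55, 3), (11, 15, 3), (12, 5, 3), (13, 25, 3),
    (14, 45, 4),
]


def middle_three_mean(values):
    """Mean of the middle three values; of all values if there are three or fewer."""
    ordered = sorted(values)
    n = len(ordered)
    if n <= 3:
        return sum(ordered) / n
    # Assumption: drop floor((n - 3) / 2) values from the bottom; any odd
    # leftover comes off the top, so 4 values lose only the highest.
    low = (n - 3) // 2
    middle = ordered[low:low + 3]
    return sum(middle) / 3


def middle_three_by_group(rows):
    """Map each group_id to the mean of its middle three test values."""
    groups = defaultdict(list)
    for _test_id, test_value, group_id in rows:
        groups[group_id].append(test_value)
    return {gid: middle_three_mean(vals) for gid, vals in sorted(groups.items())}


if __name__ == "__main__":
    for gid, avg in middle_three_by_group(ROWS).items():
        print(gid, round(avg, 2))
```

For group 3 the kept values are 15, 25, and 25, so the mean is 65/3, which is the figure shown as 21.6 in the table above.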

5 Answers

6

Another option, using analytic functions:

SELECT group_id,
       avg( test_value )
FROM (
  select t.*,
         row_number() over (partition by group_id order by test_value ) rn,
         count(*) over (partition by group_id  ) cnt
  from test t
) alias 
where 
   cnt <= 3
   or 
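   -- integer division: ( cnt + 1 ) / 2 is the middle rank, rounded up,
   -- so groups with an odd count stay centered (5 rows keep ranks 2 to 4)
   rn between ( cnt + 1 ) / 2 - 1 and ( cnt + 1 ) / 2 + 1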
group by group_id
;

Demo --> http://www.sqlfiddle.com/#!11/af5e0/59

answered 2013-09-30T18:39:12.533
2
with cte as (
    select
        *,
        row_number() over(partition by group_id order by test_value) as rn,
        count(*) over(partition by group_id) as cnt
    from test
)
select
    group_id, avg(test_value)
from cte
where
    cnt <= 3 or
    -- (cnt + 1) / 2 is the middle rank under integer division, rounded up
    (rn >= (cnt + 1) / 2 - 1 and rn <= (cnt + 1) / 2 + 1)
group by group_id
order by group_id

sql fiddle demo

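In the cte, we use window functions to get the number of elements in each group_id and to compute the row_number of each element within its group_id. Then, if that count is > 3, we find the middle of the group by halving the count and take the elements one position either side of it (+1 and -1). If count <= 3, we simply take all the elements.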
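
To see which ranks such a window keeps for each group size, the arithmetic can be enumerated outside the database. The following is a rough Python sketch rather than part of the answer: `kept_ranks` is a hypothetical helper, and it assumes the middle rank is taken as `(cnt + 1) // 2` under integer division, so that odd counts stay centered.

```python
def kept_ranks(cnt):
    """Ranks (1-based row_number values) kept for a group of `cnt` rows."""
    if cnt <= 3:
        # Small groups keep every row.
        return list(range(1, cnt + 1))
    # Assumption: the middle rank is (cnt + 1) // 2 under integer division.
    mid = (cnt + 1) // 2
    return [mid - 1, mid, mid + 1]


if __name__ == "__main__":
    for cnt in range(1, 8):
        print(cnt, kept_ranks(cnt))
```

With 4 rows this keeps ranks 1 to 3 (dropping the highest), and with 5 rows it keeps ranks 2 to 4 (dropping the highest and the lowest), which matches the rule in the question.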

answered 2013-09-30T18:46:21.080
2

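I'm not familiar with the Postgres syntax for window functions, but I was able to solve your problem in SQL Server with this SQL Fiddle. You can probably migrate it to Postgres-compatible code without much trouble. Hope it helps!

A quick primer on how it works:

  1. Order the test values within each group
  2. Get the number of items in each group
  3. Use that as a subquery and select only the middle 3 items (that is the where clause in the outer query)
  4. Get the average for each group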
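
The four steps can also be traced outside SQL. Below is a minimal Python sketch, not the answer's own code: the names `ROWS`, `window_start`, and `middle_three_averages` are hypothetical, and `window_start` is my reading of the lower-bound CASE expression in the query that follows, with the upper bound taken as that start plus 2.

```python
from collections import Counter, defaultdict

# (test_value, group_id) pairs from the question's sample data.
ROWS = [
    (5, 1), (10, 1), (15, 1),
    (25, 2), (35, 2), (5, 2), (15, 2),
    (25, 3), (45, 3), (55, 3), (15, 3), (5, 3), (25, 3),
    (45, 4),
]


def window_start(gc):
    """First rank to keep for a group of `gc` rows (integer division)."""
    if gc <= 3:
        return 1
    return gc // 2 if gc % 2 == 1 else (gc - 1) // 2


def middle_three_averages(rows):
    # Step 2: count the items in each group (gc).
    gc = Counter(group_id for _value, group_id in rows)

    # Step 1: rank the values within each group (ord), smallest first.
    next_rank = Counter()
    kept = defaultdict(list)
    for value, group_id in sorted(rows, key=lambda row: (row[1], row[0])):
        next_rank[group_id] += 1
        rank = next_rank[group_id]
        # Step 3: keep only ranks inside the three-wide window.
        start = window_start(gc[group_id])
        if start <= rank <= start + 2:
            kept[group_id].append(value)

    # Step 4: average what is left in each group.
    return {g: sum(vals) / len(vals) for g, vals in sorted(kept.items())}


if __name__ == "__main__":
    print(middle_three_averages(ROWS))
```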

--

select  
  group_id,
  avg(test_value)
from (
  select 
    t.group_id, 
    convert(decimal,t.test_value) as test_value, 
    row_number() over (
      partition by t.group_id
      order by t.test_value
    ) as ord,
    g.gc
  from
    test t
    inner join (
      select group_id, count(*) as gc
      from test
      group by group_id
    ) g
      on t.group_id = g.group_id
  ) a
where
  ord >= case when gc <= 3 then 1 when gc % 2 = 1 then gc / 2 else (gc - 1) / 2 end
  and ord <= case when gc <= 3 then 3 when gc % 2 = 1 then (gc / 2) + 2 else ((gc - 1) / 2) + 2 end
group by
  group_id
answered 2013-09-30T18:25:41.097
1

This works:

SELECT A.group_id, avg(A.test_value) AS avg_mid3 FROM
  (SELECT group_id,
         test_value,
         row_number() OVER (PARTITION BY group_id ORDER BY test_value) AS position
      FROM test) A
JOIN
  (SELECT group_id,
         CASE
           WHEN count(*) < 4 THEN 1
           WHEN count(*) % 2 = 0 THEN (count(*)/2 - 1)
           ELSE (count(*) / 2)
         END AS position_start,
         CASE
           WHEN count(*) < 4 THEN count(*)
           WHEN count(*) % 2 = 0 THEN (count(*)/2 + 1)
           ELSE (count(*) / 2 + 2)
         END AS position_end
         FROM test GROUP BY group_id) B
  ON A.group_id=B.group_id 
  AND A.position >= B.position_start 
  AND A.position <= B.position_end
GROUP BY A.group_id

Fiddle link: http://www.sqlfiddle.com/#!11/af5e0/56

answered 2013-09-30T18:31:16.803
0

If you need to compute the average for each group, you can do it like this:

SELECT CASE WHEN NUMBER_FIRST_GROUP <> 0 
               THEN SUM_FIRST_GROUP / NUMBER_FIRST_GROUP 
               ELSE NULL
       END AS AVG_FIRST_GROUP,
       CASE WHEN NUMBER_SECOND_GROUP <> 0 
               THEN SUM_SECOND_GROUP / NUMBER_SECOND_GROUP 
               ELSE NULL
       END AS AVG_SECOND_GROUP,
       CASE WHEN NUMBER_THIRD_GROUP <> 0 
               THEN SUM_THIRD_GROUP / NUMBER_THIRD_GROUP 
               ELSE NULL
       END AS AVG_THIRD_GROUP,
       CASE WHEN NUMBER_FOURTH_GROUP <> 0 
               THEN SUM_FOURTH_GROUP / NUMBER_FOURTH_GROUP 
               ELSE NULL
       END AS AVG_FOURTH_GROUP
FROM (
      SELECT 
         SUM(CASE WHEN GROUP_ID = 1 THEN 1 ELSE 0 END) AS NUMBER_FIRST_GROUP,
         SUM(CASE WHEN GROUP_ID = 1 THEN TEST_VALUE ELSE 0 END) AS SUM_FIRST_GROUP,
         SUM(CASE WHEN GROUP_ID = 2 THEN 1 ELSE 0 END) AS NUMBER_SECOND_GROUP,
         SUM(CASE WHEN GROUP_ID = 2 THEN TEST_VALUE ELSE 0 END) AS SUM_SECOND_GROUP,
         SUM(CASE WHEN GROUP_ID = 3 THEN 1 ELSE 0 END) AS NUMBER_THIRD_GROUP,
         SUM(CASE WHEN GROUP_ID = 3 THEN TEST_VALUE ELSE 0 END) AS SUM_THIRD_GROUP,
         SUM(CASE WHEN GROUP_ID = 4 THEN 1 ELSE 0 END) AS NUMBER_FOURTH_GROUP,
         SUM(CASE WHEN GROUP_ID = 4 THEN TEST_VALUE ELSE 0 END) AS SUM_FOURTH_GROUP
     FROM TEST
     ) AS FOO
answered 2013-09-30T18:18:57.423