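I have a table with a lot of records in which some field values are duplicated. For each set of duplicates, I want the most common combination.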

So, if my table has data like this:

 ID     Field1     Field2  
  1      A          10  
  2      A          12 
  3      B          5  
  4      A          10  
  5      B          5  
  6      A          10  
  7      B          8
  8      B          5
  9      A          10

I can select the distinct pairs and get a count for each:

select distinct Field1, Field2, count(Field1)
from Table
group by Field1, Field2
order by Field1, count(Field1) desc

That gives me

Field1    Field2     Count
A         10         4
A         12         1
B          5         3
B          8         1

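However, I only want the record with the highest count for each Field1. I have been fighting with rank() over partition and subqueries, but I have not found the right syntax for using the two fields for uniqueness and selecting the top record by count. I have been searching, and I am sure this has been asked before, but I cannot find it.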

I want to get the following

Field1     Field2       (optional) Count 
 A          10           4
 B           5           3

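The goal is to look at a table that has just a small amount of incorrect data (wrong links between Field1 and Field2) and determine what each link should be based on what it usually is. I don't know how many bad records there are, so eliminating rows with a Count below some threshold would work, but it seems a bit kludgy.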

If it is better, I can make a temp table to hold my distinct values and then select from that, but it does not seem like that should be necessary.

3 Answers

I think this is what you are looking for:

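-- Outer query: keep only the top-ranked pair(s) for each field1
select field1, field2, cnt
from (select field1, field2, cnt,
             -- rank pairs within each field1, highest count first
             rank() over (partition by field1 order by cnt desc) rnk
      -- count each (Field1, Field2) pair; group by already makes the pairs
      -- distinct, so no distinct or inner order by is needed here
      from (select Field1, Field2, count(Field1) cnt
            from Table1
            group by Field1, Field2))
where rnk = 1;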
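
One thing to keep in mind with rank(): if two Field2 values tie for the highest count within a Field1, both come back with rnk = 1. Here is a minimal sketch of a variant that returns exactly one row per Field1 by swapping in row_number(). The tie-break rule (lowest Field2 wins) and the tiny inline data set t are my assumptions for illustration, not something from the question:

```sql
-- Hypothetical sample data: 'A' has two field2 values with the same count (a tie).
with t as (
  select 'A' field1, 10 field2 from dual union all
  select 'A', 12 from dual union all
  select 'B', 5 from dual
)
select field1, field2, cnt
from (select field1, field2, cnt,
             -- row_number() never repeats within a partition;
             -- assumption: ties on cnt are broken by the smaller field2
             row_number() over (partition by field1
                                order by cnt desc, field2) rn
      from (select field1, field2, count(*) cnt
            from t
            group by field1, field2))
where rn = 1;
-- Returns (A, 10, 1) and (B, 5, 1); with rank() the (A, 12, 1) row would also appear.
```

Whether silently picking one of the tied values is acceptable depends on your data. For a cleanup job, the ties may be exactly the rows you want to inspect by hand.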

SQL Fiddle: http://sqlfiddle.com/#!4/fe96d/3

answered 2012-12-04T18:41:05.533

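This is somewhat inelegant because of the multiple layers of nested subqueries. It should, however, be reasonably efficient, and the steps in the SQL should be fairly easy to follow.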

with x as (
  select 1 id, 'A' field1, 10 field2 from dual union all
  select 2, 'A', 12 from dual union all
  select 3, 'B', 5 from dual union all
  select 4, 'A', 10 from dual union all
  select 5, 'B', 5 from dual union all
  select 6, 'A', 10 from dual union all
  select 7, 'B', 8 from dual union all
  select 8, 'B', 5 from dual union all
  select 9, 'A', 10 from dual
)
select field1,
       field2,
       cnt
  from (select field1,
               field2,
               cnt,
               rank() over (partition by field1
                                order by cnt desc) rnk
          from (select field1, field2, count(*) cnt
                  from x
                 group by field1, field2))
 where rnk = 1;

F     FIELD2        CNT
- ---------- ----------
A         10          4
B          5          3
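
If the nesting is bothersome, one level can probably be dropped: analytic functions are evaluated after GROUP BY, so rank() can order by the aggregate directly in the same query block. A sketch using the same sample data as above (not benchmarked, so treat any efficiency difference as an open question):

```sql
with x as (
  select 1 id, 'A' field1, 10 field2 from dual union all
  select 2, 'A', 12 from dual union all
  select 3, 'B', 5 from dual union all
  select 4, 'A', 10 from dual union all
  select 5, 'B', 5 from dual union all
  select 6, 'A', 10 from dual union all
  select 7, 'B', 8 from dual union all
  select 8, 'B', 5 from dual union all
  select 9, 'A', 10 from dual
)
select field1, field2, cnt
  from (select field1,
               field2,
               count(*) cnt,
               -- the window function sees the grouped rows,
               -- so it can rank on count(*) without another subquery
               rank() over (partition by field1
                                order by count(*) desc) rnk
          from x
         group by field1, field2)
 where rnk = 1;
-- Returns (A, 10, 4) and (B, 5, 3), the same rows as the query above.
```

The outer query is still needed because a window function cannot be referenced in the WHERE clause of the block that computes it.
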
answered 2012-12-04T18:30:50.060

A third approach ;)

select field1,
       field2,
       max_cnt
from (
  select field1, 
         field2, 
         cnt,
         max(cnt) over (partition by field1, field2) as max_cnt,
         row_number() over (partition by field1 order by cnt desc) as rn
  from (
      select field1, 
             field2, 
             count(*) over (partition by Field1, Field2) as cnt
      from idlist
  ) t1 
) t2
where max_cnt = cnt 
and rn = 1
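
If the count column really is optional, Oracle's STATS_MODE aggregate may be a shorter route, since it returns the most frequent value of its argument within each group. A minimal sketch, with an inline idlist CTE standing in for the real table (the CTE is only there so the example runs on its own):

```sql
-- Stand-in for the real idlist table, filled with the sample rows from the question.
with idlist as (
  select 'A' field1, 10 field2 from dual union all
  select 'A', 12 from dual union all
  select 'B', 5 from dual union all
  select 'A', 10 from dual union all
  select 'B', 5 from dual union all
  select 'A', 10 from dual union all
  select 'B', 8 from dual union all
  select 'B', 5 from dual union all
  select 'A', 10 from dual
)
select field1,
       stats_mode(field2) as field2  -- most frequent field2 per field1
from idlist
group by field1;
-- Returns (A, 10) and (B, 5).
```

Two caveats: this form does not give you the count, and when several values tie for most frequent, Oracle returns just one of them, so it hides ties rather than exposing them.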

SQLFiddle: http://sqlfiddle.com/#!4/8461f/1

answered 2012-12-04T18:44:47.487