I have a table:

ID  CLUSTERID
1   56
1   24
1   24
1   35
2   13
2   24
2      24

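Now I want to get the following: for each ID, I want to find which CLUSTERID occurs most often. For example, for ID=1, CLUSTERID=24 occurs most often. For ID=2, two CLUSTERIDs are tied with the same number of occurrences. So the output should be: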

ID  CLUSTERID
1   24
2   13
2   24

Here is the answer I wrote (it works). tt is my original table; it has two columns: ID and ClusterID.

SELECT t3.ID, t3.ClusterID, t3.ListingAmount
FROM
  (SELECT ID, ClusterID, COUNT(*) AS ListingAmount
   FROM tt
   GROUP BY ID, ClusterID) AS t3
LEFT JOIN
  (SELECT ID, MAX(ListingAmount) AS amount
   FROM
     (SELECT ID, ClusterID, COUNT(*) AS ListingAmount
      FROM tt
      GROUP BY ID, ClusterID) AS t2
   GROUP BY ID) AS BB ON BB.ID = t3.ID
WHERE BB.amount = t3.ListingAmount

1 Answer

I can't think of a more elegant solution right now (I'm sure there is one), but this seems to do the job:

select t1.id, 
       t1.clusterid,
       t1.cnt
from (
    select id, 
           clusterid, 
           count(*) as cnt
    from foo
    group by id, clusterid
) t1 
  join (select id, 
                max(cnt) as max_count
        from (       
            select id, 
                   clusterid, 
                   count(*) as cnt
            from foo
            group by id, clusterid
        ) tm
        group by id
  ) t2 on t1.id = t2.id 
      and t1.cnt = t2.max_count
order by t1.id, t1.cnt;

SQLFiddle example: http://sqlfiddle.com/#!2/2cacc/3
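In case the fiddle link is unavailable, here is a minimal sketch that runs the same query locally against the sample rows from the question. It assumes Python's built-in sqlite3 module and an in-memory table named foo. The helper name most_frequent_clusters is made up for illustration, and the ORDER BY uses clusterid as the second key (instead of cnt) only so that tied rows come back in a predictable order.

```python
import sqlite3

# Sample rows from the question: (ID, CLUSTERID)
ROWS = [(1, 56), (1, 24), (1, 24), (1, 35), (2, 13), (2, 24)]

# Same approach as the query above: count per (id, clusterid),
# then keep the rows whose count equals the per-id maximum.
QUERY = """
select t1.id, t1.clusterid, t1.cnt
from (
    select id, clusterid, count(*) as cnt
    from foo
    group by id, clusterid
) t1
join (
    select id, max(cnt) as max_count
    from (
        select id, clusterid, count(*) as cnt
        from foo
        group by id, clusterid
    ) tm
    group by id
) t2 on t1.id = t2.id
    and t1.cnt = t2.max_count
order by t1.id, t1.clusterid
"""


def most_frequent_clusters(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table foo (id integer, clusterid integer)")
        conn.executemany("insert into foo (id, clusterid) values (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in most_frequent_clusters(ROWS):
        print(row)
```

With the sample data this should print one row for ID=1 (cluster 24, seen twice) and two rows for ID=2 (clusters 13 and 24, each seen once), which matches the output requested in the question.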

answered 2013-04-10T09:58:22.437