
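I'm fairly new to SQL and am working through some practice problems. I have a sample Twitter database, and I'm trying to find the top 3 users in each location based on their number of followers.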

Here are the tables I'm using:

id_follower_location

        id       | followers | location 
-----------------+-----------+----------
 id28929238      |         1 | Toronto
 id289292338     |         1 | California
 id2892923838    |         2 | Rome
 .
 .

locations

           location       
----------------------
 Bay Area, California
 London
 Nashville, TN
.
.

I've been able to find the "top" user in each location with:

create view top1id as 
  select location, 
    (select id_follower_location.id from id_follower_location 
      where id_follower_location.location = locations.location 
      order by followers desc limit 1
    ) as id 
  from locations;

create view top1 as 
  select location, id, 
    (select followers from id_follower_location 
      where id_follower_location.id = top1id.id
    ) as followers 
  from top1id;

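The only way I could come up with to solve this is to find the "top 1st", "top 2nd", and "top 3rd" users separately and then combine them with union. Is this the right/only way to do it? Or is there a better way?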


2 Answers


Top n

With rank() you get at least 3 rows per location (fewer if fewer exist). If there are ties among the top 3, more rows may be returned. See:

If you want exactly 3 rows per location (fewer if fewer exist), you have to break ties. One way is to use row_number() instead of rank().

SELECT *
FROM (
   SELECT id, location
        , row_number() OVER (PARTITION BY location ORDER BY followers DESC) AS rn
   FROM   id_follower_location
   ) r
WHERE  rn <= 3
ORDER  BY location, rn;

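The ORDER BY on the outer query is what guarantees sorted output; without it, the row order is unspecified.
If more than three candidates qualify, you get an arbitrary pick among the tied rows, unless you add more items to the ORDER BY in the OVER clause to break ties.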
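
To make the tie-breaking concrete, here is a minimal sketch that adds a second sort key inside OVER so the pick among tied users is deterministic. Using id as the tiebreaker is an assumption for illustration; any column (or combination of columns) that makes the ordering unique would do.

```sql
SELECT id, location, followers, rn
FROM (
   SELECT id, location, followers
        , row_number() OVER (PARTITION BY location
                             ORDER BY followers DESC, id) AS rn  -- id breaks ties (assumed choice)
   FROM   id_follower_location
   ) r
WHERE  rn <= 3
ORDER  BY location, rn;
```

With the extra sort key, repeated runs over the same data return the same three rows per location.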

Top 1

As for the query to get the top 1 row: there is a simpler and faster way in PostgreSQL:

SELECT DISTINCT ON (location)
       id, location           -- add additional columns freely
FROM   id_follower_location
ORDER  BY location, followers DESC;
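
A hedged variant of the same query, in case ties for first place matter: DISTINCT ON keeps the first row per location according to ORDER BY, so appending a tiebreaker makes the pick deterministic. The leading ORDER BY expression has to match the DISTINCT ON expression. Using id as the tiebreaker and returning followers are assumptions for illustration.

```sql
SELECT DISTINCT ON (location)
       id, location, followers
FROM   id_follower_location
ORDER  BY location           -- must match the DISTINCT ON expression
        , followers DESC     -- highest follower count first
        , id;                -- tiebreaker (assumed choice)
```

DISTINCT ON is a PostgreSQL extension, so this form is not portable to other databases. If followers can be NULL, note that DESC sorts NULLs first in PostgreSQL, so you may want followers DESC NULLS LAST.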

Details on this query technique are in this closely related answer:

answered 2013-04-14T13:56:34.610

You can do this with window functions: http://www.postgresql.org/docs/9.1/static/tutorial-window.html

For example (untested; may need minor syntax fixes):

SELECT follower_ranks.id, follower_ranks.location 
FROM (
    SELECT id, location, 
      RANK() OVER (PARTITION BY location ORDER BY followers DESC) 
    FROM id_follower_location
) follower_ranks 
WHERE follower_ranks.rank <= 3;
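
To see how RANK() behaves when follower counts tie, here is a small self-contained sketch that can be run without any tables. The demo rows are made up for illustration and are not taken from the question's data; row_number() is shown alongside for comparison, with id as an assumed tiebreaker.

```sql
WITH demo(id, followers, location) AS (
   VALUES ('a', 5, 'Rome')
        , ('b', 3, 'Rome')
        , ('c', 3, 'Rome')
        , ('d', 3, 'Rome')
        , ('e', 1, 'Rome')
   )
SELECT id, followers
     , rank()       OVER (PARTITION BY location ORDER BY followers DESC)     AS rnk  -- 1, 2, 2, 2, 5
     , row_number() OVER (PARTITION BY location ORDER BY followers DESC, id) AS rn   -- 1, 2, 3, 4, 5
FROM   demo
ORDER  BY followers DESC, id;
```

Filtering on rnk <= 3 here would return four rows (a, b, c, d) because the three tied users share rank 2, while filtering on rn <= 3 would return exactly three (a, b, c).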
answered 2013-04-14T03:09:39.697