
Details

I am joining the following tables

testresults

--------------------------------------------------------------------
| index | uid|         start         |          stop        | score| 
--------------------------------------------------------------------
|   1   | 23 |   2012-06-06 07:30:20 | 2012-06-06 07:30:34  | 100  |
--------------------------------------------------------------------
|   2   | 34 |   2012-06-06 07:30:21 | 2012-06-06 07:30:40  | 100  |
--------------------------------------------------------------------

usertable

------------------------------
| id  |       username       |  
------------------------------
| 23  |    MacGyver’s mum    | 
------------------------------
| 34  |       Gribblet       | 
------------------------------

using this SQL

SELECT a.username, b.duration, b.score
FROM usertable AS a
JOIN    (SELECT `uid`, `score`,
TIMESTAMPDIFF( SECOND, start, stop ) AS `duration`
FROM `testresults`
WHERE `start` >= DATE(NOW())
ORDER BY `score` DESC, `duration` ASC
LIMIT 100) AS b
ON a.id = b.uid

The problem is that I want to rank the results. I thought it would probably be easier/faster to do this in SQL than in PHP, so based on http://code.openark.org/blog/mysql/sql-ranking-without-self-join this is what I tried

SELECT a.username, b.duration, b.score, COUNT(DISTINCT b.duration, b.score) AS rank
FROM usertable AS a
JOIN    (SELECT `uid`, `score`,
TIMESTAMPDIFF( SECOND, start, stop ) AS `duration`
FROM `testresults`
WHERE `start` >= DATE(NOW())
ORDER BY `score` DESC, `duration` ASC
LIMIT 100) AS b
ON a.id = b.uid

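But I don't get back the expected ranks. It only returns one row.

Question

What am I doing wrong? How can I make the rank increase only when the duration and score combination is unique?

Update 1

Using bdenham's "slow method" worked for me, but the second method doesn't. I don't really understand what is going on in the "fast method". I have posted the data I am using and the resulting table. You will see that the ranking is messed up.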

------------------------------------------------------------------------------
| index | uid |        start        |        stop         | score | duration |
------------------------------------------------------------------------------
|   1   | 32  | 2012-08-27 05:47:18 | 2012-08-27 05:47:36 |  100  |   18s    |
|   2   | 32  | 2012-08-27 05:50:36 | 2012-08-27 05:50:42 |   0   |    6s    |
|   3   | 32  | 2012-08-27 05:51:18 | 2012-08-27 05:51:25 |  100  |    7s    |
|   4   | 32  | 2012-08-27 05:51:30 | 2012-08-27 05:51:35 |   0   |    5s    |
|   5   | 32  | 2012-08-27 05:51:39 | 2012-08-27 05:51:44 |  50   |    5s    |
------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| username | score | duration | @prevScore:=@currScore | @prevDuration:=@currDuration | @currScore:=r.score | @currDuration:=timestampdiff(second,r.start,r.stop) |rank |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   bob    |  100  |    7     |     [BLOB - 1B]        |         [BLOB - 1B]          |     100             |                                7                    |  3  |
|   bob    |  100  |    18    |     [BLOB - 0B]        |         [BLOB - 0B]          |     100             |                               18                    |  1  |
|   bob    |   50  |    5     |     [BLOB - 1B]        |         [BLOB - 1B]          |      50             |                                5                    |  5  |
|   bob    |   0   |    5     |     [BLOB - 3B]        |         [BLOB - 1B]          |       0             |                                5                    |  4  |
|   bob    |   0   |    6     |     [BLOB - 3B]        |         [BLOB - 2B]          |       0             |                                6                    |  2  |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------

1 Answer


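Both methods from the link in your question work with MySQL 5.5.25. Here is the SQL Fiddle. However, I was unable to adapt those methods to your slightly more complex model. You have an extra join, and your rank is based on two columns instead of one.

Your attempt does not follow either method, though I suspect you were trying to follow the slow "traditional" solution. As others have pointed out, that solution requires a self-join and a GROUP BY, both of which your query lacks entirely.

Here is my failed attempt to adapt the slow method to your model. The problem is that MySQL keeps only the username of the last row found for a given rank. Earlier rows with the same rank are discarded from the results. The query would not run on most databases because the GROUP BY does not include username. MySQL has non-standard rules for GROUP BY. I don't understand why your moderately complex model doesn't work while the simple linked model does. In any case, I think leaving selected columns out of the GROUP BY is a bad idea.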

select u.username,
       r1.score,
       timestampdiff(second,r1.start,r1.stop) duration,
       count( distinct concat(r2.score,',',timestampdiff(second,r2.start,r2.stop)) ) rank
  from testresults r1
  join testresults r2
    on r2.score>r1.score
     or( r2.score=r1.score
         and
         timestampdiff(second,r2.start,r2.stop)<=timestampdiff(second,r1.start,r1.stop)
       )
  join usertable u
    on u.id=r1.uid
 where r1.start>=date(now())
   and r2.start>=date(now())
 group by r1.score, duration
 order by score desc, duration asc limit 100

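Here is a fix for the slow method. It first computes the rank of each unique score/duration pair, then joins that result to each test result. This works, but it is even slower than the original broken method.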

select username,
       r.score,
       r.duration,
       r.rank
  from testresults tr
  join usertable u
    on u.id=tr.uid
  join (
          select r1.score,
                 timestampdiff(second,r1.start,r1.stop) duration,
                 count( distinct concat(r2.score,',',timestampdiff(second,r2.start,r2.stop)) ) rank
            from testresults r1
            join testresults r2
              on r2.score>r1.score
               or( r2.score=r1.score
                   and
                   timestampdiff(second,r2.start,r2.stop)<=timestampdiff(second,r1.start,r1.stop)
                 )
           where r1.start>=date(now())
             and r2.start>=date(now())
           group by r1.score, duration
       ) r
    on r.score=tr.score
   and r.duration=timestampdiff(second,tr.start,tr.stop)
 where tr.start>=date(now())
 order by rank limit 100
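
As a rough illustration of the counting idea behind the corrected slow query above, here is a small Python sketch. It is not the query itself, only the same logic outside the database: a row's rank is the number of distinct (score, duration) pairs that are at least as good as its own. The first five rows reuse the scores and durations from the asker's Update 1; the "amy" row and the function names are hypothetical, added only to show a tie.

```python
# Rows as (username, score, duration_seconds).
# The "amy" row is a made-up tie with bob's best result.
sample_rows = [
    ("bob", 100, 18),
    ("bob", 0, 6),
    ("bob", 100, 7),
    ("bob", 0, 5),
    ("bob", 50, 5),
    ("amy", 100, 7),
]


def at_least_as_good(other, row):
    """Mirror the self-join condition: higher score, or equal score and duration <=."""
    _, other_score, other_duration = other
    _, row_score, row_duration = row
    return other_score > row_score or (
        other_score == row_score and other_duration <= row_duration
    )


def rank_by_counting(rows):
    """Rank each row by counting distinct (score, duration) pairs that beat or tie it."""
    ranked = []
    for row in rows:
        # The set plays the role of COUNT(DISTINCT ...) in the query.
        pairs = {(o[1], o[2]) for o in rows if at_least_as_good(o, row)}
        ranked.append((row[0], row[1], row[2], len(pairs)))
    # Order by rank, then username, purely to make the output stable.
    ranked.sort(key=lambda r: (r[3], r[0]))
    return ranked


for line in rank_by_counting(sample_rows):
    print(line)
```

Every row is compared with every other row, which is the quadratic cost that makes this approach slow on large tables.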

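Here is my failed attempt to adapt the fast method to your model. It does not work because the variables in the SELECT list are evaluated before the sort operation. Again, I don't understand why the simple model in the link works but yours doesn't.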

select u.username,
       r.score,
       timestampdiff(second,r.start,r.stop) duration,
       @prevScore:=@currScore,
       @prevDuration:=@currDuration,
       @currScore:=r.score,
       @currDuration:=timestampdiff(second,r.start,r.stop),
       @rank:=if(@prevScore=@currScore and @prevDuration=@currDuration, @rank, @rank+1) rank
  from testresults r
  join usertable u
    on u.id=r.uid
  cross join (select @currScore:=null, @currDuration:=null, @prevScore:=null, @prevDuration:=null, @rank:=0) init
 where r.start>=date(now())
 order by score desc, duration asc limit 100
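
To make the "evaluated before the sort" explanation concrete, here is a minimal Python sketch. It assumes the rows are read in table (index) order, which is an assumption about the execution plan, not something the query guarantees. The data is the (score, duration) pairs from the asker's Update 1, and the function name is hypothetical.

```python
# (score, duration_seconds) in the order the rows appear in the table.
scan_order = [(100, 18), (0, 6), (100, 7), (0, 5), (50, 5)]


def rank_in_scan_order(rows):
    """Number the rows while scanning, as the user variables do before sorting."""
    rank = 0
    prev = None  # previous (score, duration) pair, like @prevScore/@prevDuration
    out = []
    for score, duration in rows:
        if (score, duration) != prev:
            rank += 1
        prev = (score, duration)
        out.append((score, duration, rank))
    return out


# The ORDER BY is applied afterwards, so the ranks travel with their rows.
ranked_then_sorted = sorted(
    rank_in_scan_order(scan_order), key=lambda r: (-r[0], r[1])
)

for line in ranked_then_sorted:
    print(line)
```

Read top to bottom, the ranks come out as 3, 1, 5, 4, 2, which is the scrambled rank column shown in the asker's result table.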

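Here is a "fixed" version of the fast method. But it relies on the order of the sorted rows coming out of the subquery. In general, a query should never rely on row order unless there is an explicit SORT operation. The outer query is not sorted, and even if it were, I don't know whether the variables would be evaluated before or after the outer sort.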

select username,
       score,
       duration,
       @prevScore:=@currScore,
       @prevDuration:=@currDuration,
       @currScore:=score,
       @currDuration:=duration,
       @rank:=if(@prevScore=score and @prevDuration=duration, @rank, @rank+1) rank
  from (
          select u.username,
                 r.score,
                 timestampdiff(second,r.start,r.stop) duration
            from testresults r
            join usertable u
              on u.id=r.uid
           where r.start>=date(now())
           order by score desc, duration asc limit 100
       ) scores,
       (
          select @currScore:=null, 
                 @currDuration:=null, 
                 @rank:=0
       ) init

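I think you would get equally good performance if you simply selected the results without a rank, ordered by score and duration. Since the results are already sorted, your PHP can compute the rank efficiently. Your PHP can initialize rank to 0 and the previous score and duration to null, then compare each row with the previous values and increment the rank if either one differs. The biggest advantage of letting PHP rank the sorted results is that it should always work, regardless of the brand or version of the database engine. And it should still be fast.

Here is the SQL Fiddle showing all 4 queries. I modified the WHERE clauses so that the queries keep working on any date.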
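
The application-side loop described above might look like the following. This is a sketch in Python rather than PHP, used here only as a language-neutral illustration; the function name and the sample rows are hypothetical, and the rows are assumed to arrive already sorted by score descending, then duration ascending.

```python
def rank_sorted_rows(sorted_rows):
    """Assign ranks to rows that are already ordered by score desc, duration asc.

    Rows with the same score and duration share a rank; any change in either
    value moves to the next rank.
    """
    rank = 0
    prev_score = None
    prev_duration = None
    ranked = []
    for username, score, duration in sorted_rows:
        if score != prev_score or duration != prev_duration:
            rank += 1
        prev_score, prev_duration = score, duration
        ranked.append((username, score, duration, rank))
    return ranked


# Made-up rows in the order the database would return them.
already_sorted = [
    ("bob", 100, 7),
    ("amy", 100, 7),
    ("bob", 100, 18),
    ("bob", 50, 5),
    ("bob", 0, 5),
    ("bob", 0, 6),
]

for line in rank_sorted_rows(already_sorted):
    print(line)
```

The loop makes a single pass and keeps only the previous row's values, so the cost is linear in the number of rows returned.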

Answered 2012-08-26T15:29:27.230