I want to use DISTINCT on the table below, but only on the "PlayerID" column. This is what I have so far:

   MATCHID   PLAYERID     TEAMID MATCHDATE STARTDATE
---------- ---------- ---------- --------- ---------
        20          5          2 14-JAN-12 01-JUN-11
        20          5          4 14-JAN-12 01-JUN-10
        20          7          4 14-JAN-12 01-JUN-11
        20          7          2 14-JAN-12 01-JUN-10
        20         10          4 14-JAN-12 01-JUN-11
        20         11          2 14-JAN-12 01-JUN-10
        20         13          2 14-JAN-12 01-JUN-11
        20         16          4 14-JAN-12 01-JUN-10
        20         17          4 14-JAN-12 01-JUN-10
        20         18          4 14-JAN-12 01-JUN-10
        20         19          2 14-JAN-12 01-JUN-11

This is what I want: the row with the highest "StartDate" for each "PlayerID" is shown and the player's other rows are ignored:

   MATCHID   PLAYERID     TEAMID MATCHDATE STARTDATE
---------- ---------- ---------- --------- ---------
        20          5          2 14-JAN-12 01-JUN-11
        20          7          4 14-JAN-12 01-JUN-11
        20         10          4 14-JAN-12 01-JUN-11
        20         11          2 14-JAN-12 01-JUN-10
        20         13          2 14-JAN-12 01-JUN-11
        20         16          4 14-JAN-12 01-JUN-10
        20         17          4 14-JAN-12 01-JUN-10
        20         18          4 14-JAN-12 01-JUN-10
        20         19          2 14-JAN-12 01-JUN-11

Current SQL:

SELECT pi.MatchID, pi.PlayerID, t.TeamID, m.MatchDate, pf.StartDate
FROM Plays_In pi, Match m, Plays_A pa, Team t, Plays_For pf, Made_Up_Of muo, Season s
WHERE pi.MatchID = m.MatchID
AND m.MatchID = pa.MatchID
AND pa.TeamID = t.TeamID
AND pf.PlayerID = pi.PlayerID
AND pf.TeamID = t.TeamID
AND muo.MatchID = pi.MatchID
AND muo.SeasonID = s.SeasonID
AND pi.MatchID = '&match_id'
AND m.MatchDate >= pf.StartDate
ORDER BY pi.MatchID ASC, pi.PlayerID ASC, pf.StartDate DESC;

It is an Oracle database.

Thanks in advance.

3 Answers

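A few points...

  • Unless you are using the joins to Made_Up_Of and Season to filter rows, you don't need those tables. I've left them out here; you can add them back in if you need them.

  • Mark Tickner is correct that you should use the ANSI JOIN syntax. The nice thing about it (other than being standard) is that it puts the join logic right next to the table being joined. Once you get used to it, I think you'll find it preferable.

  • What you're really after is the maximum pf.StartDate for each PlayerID, which is a nice fit for the analytic ROW_NUMBER() function. PARTITION BY pi.PlayerID ORDER BY pf.StartDate DESC assigns the value 1 to the row with each player's most recent start date. The outer query filters out every row except those ranked 1.

  • You can also assign rankings with the RANK() and DENSE_RANK() analytic functions, but if a player has a tie for the most recent date, every tied row is ranked 1 and you get multiple rows for that player. When you want only one row per player, as here, use ROW_NUMBER() instead.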
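
The difference between the three ranking functions is easiest to see on a tie. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Oracle (assuming a SQLite build of 3.25 or later, which supports window functions), and the table `plays_for_demo` and its three rows are made up for the example.

```python
import sqlite3

def compare_rank_functions():
    """Return (row_number, rank, dense_rank) for one player with a tied latest date."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: one player, two rows tied on the most recent start date.
    conn.execute(
        "CREATE TABLE plays_for_demo (player_id INTEGER, team_id INTEGER, start_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO plays_for_demo VALUES (?, ?, ?)",
        [
            (5, 2, "2011-06-01"),
            (5, 4, "2011-06-01"),  # ties with the row above
            (5, 3, "2010-06-01"),
        ],
    )
    rows = conn.execute(
        """
        SELECT ROW_NUMBER() OVER (PARTITION BY player_id ORDER BY start_date DESC) AS rn,
               RANK()       OVER (PARTITION BY player_id ORDER BY start_date DESC) AS rnk,
               DENSE_RANK() OVER (PARTITION BY player_id ORDER BY start_date DESC) AS drnk
        FROM plays_for_demo
        ORDER BY rn
        """
    ).fetchall()
    conn.close()
    return rows

for row in compare_rank_functions():
    print(row)
```

Filtering on rank 1 would keep both tied rows with RANK() or DENSE_RANK(), while ROW_NUMBER() numbers the tied rows 1 and 2, so only one of them survives the filter.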

Put it all together and you get this:

SELECT MatchID, PlayerID, TeamID, MatchDate, StartDate FROM (
  SELECT
    pi.MatchID,
    pi.PlayerID,
    t.TeamID,
    m.MatchDate,
    pf.StartDate,
    ROW_NUMBER() OVER (PARTITION BY pi.PlayerID ORDER BY pf.StartDate DESC) AS StartDateRank
  FROM Plays_In pi
  INNER JOIN Match m ON pi.MatchID = m.MatchID
  INNER JOIN Plays_A pa ON m.MatchID = pa.MatchID
  INNER JOIN Team t ON pa.TeamID = t.TeamID
  INNER JOIN Plays_For pf ON pf.PlayerID = pi.PlayerID AND pf.TeamID = t.TeamID
  WHERE pi.MatchID = '&match_id'
  AND m.MatchDate >= pf.StartDate
)
WHERE StartDateRank = 1
ORDER BY MatchID, PlayerID
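
To check the ROW_NUMBER() filter against the data in the question, here is a minimal sketch that again uses Python's sqlite3 module in place of Oracle (assuming SQLite 3.25 or later). The flat table `joined_rows` is hypothetical: it simply holds the eleven rows the question's query already returns, with the dates rewritten as ISO strings so they sort correctly as text.

```python
import sqlite3

# The eleven rows from the question (MatchID, PlayerID, TeamID, MatchDate, StartDate).
QUESTION_ROWS = [
    (20, 5, 2, "2012-01-14", "2011-06-01"),
    (20, 5, 4, "2012-01-14", "2010-06-01"),
    (20, 7, 4, "2012-01-14", "2011-06-01"),
    (20, 7, 2, "2012-01-14", "2010-06-01"),
    (20, 10, 4, "2012-01-14", "2011-06-01"),
    (20, 11, 2, "2012-01-14", "2010-06-01"),
    (20, 13, 2, "2012-01-14", "2011-06-01"),
    (20, 16, 4, "2012-01-14", "2010-06-01"),
    (20, 17, 4, "2012-01-14", "2010-06-01"),
    (20, 18, 4, "2012-01-14", "2010-06-01"),
    (20, 19, 2, "2012-01-14", "2011-06-01"),
]

def latest_start_per_player(rows):
    """Keep only the row with the most recent start_date for each player."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE joined_rows ("
        "match_id INTEGER, player_id INTEGER, team_id INTEGER, "
        "match_date TEXT, start_date TEXT)"
    )
    conn.executemany("INSERT INTO joined_rows VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT match_id, player_id, team_id, match_date, start_date
        FROM (
            SELECT jr.*,
                   ROW_NUMBER() OVER (
                       PARTITION BY player_id
                       ORDER BY start_date DESC
                   ) AS start_date_rank
            FROM joined_rows jr
        )
        WHERE start_date_rank = 1
        ORDER BY match_id, player_id
        """
    ).fetchall()
    conn.close()
    return result

for row in latest_start_per_player(QUESTION_ROWS):
    print(row)
```

On this data the sketch returns nine rows, one per player, matching the desired output in the question: players 5 and 7 keep their 2011 rows and lose their 2010 rows.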

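One last point: based on WHERE pi.MatchID = '&match_id', it looks like you may be using PHP as your front end and the mysql functions to execute the query. If so, please look into mysqli or PDO instead, because their prepared statements with bound parameters protect you from SQL injection. The mysql functions (which are officially deprecated) do not.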
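
The idea behind bound parameters is the same in any client library, so here is a rough sketch in Python's sqlite3 module rather than PHP. The table `plays_in` and the helper names `make_demo_db` and `rows_for_match` are hypothetical, made up for this illustration.

```python
import sqlite3

def make_demo_db():
    """Build a tiny in-memory table to query (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE plays_in (match_id INTEGER, player_id INTEGER)")
    conn.executemany(
        "INSERT INTO plays_in VALUES (?, ?)",
        [(20, 5), (20, 7), (21, 9)],
    )
    return conn

def rows_for_match(conn, match_id):
    """Look up players for one match using a bound parameter."""
    # The ? placeholder sends match_id to the database as a value,
    # so it is never parsed as part of the SQL text.
    return conn.execute(
        "SELECT match_id, player_id FROM plays_in "
        "WHERE match_id = ? ORDER BY player_id",
        (match_id,),
    ).fetchall()

conn = make_demo_db()
print(rows_for_match(conn, 20))
print(rows_for_match(conn, "20 OR 1=1"))
```

The second call passes a typical injection attempt. Because the whole string is compared as a single value, it matches no match_id and returns no rows, whereas pasting it into the SQL text would have returned every row.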


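Addendum: more about ROW_NUMBER, with many thanks to @AndriyM.

With ROW_NUMBER, if a player has more than one row with the most recent date, only one of those rows is assigned ROW_NUMBER = 1, and that row is picked more or less at random. Here is an example where a player's most recent date is 5/1/2013 and the player has three rows with that date: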

pi.MatchID  pi.PlayerID  pf.StartDate
----------  -----------  ------------
      100         1000   05/01/2013 <-- could be ROW_NUMBER = 1
      101         1000   04/29/2013
      105         1000   05/01/2013 <-- could be ROW_NUMBER = 1
      102         1000   05/01/2013 <-- could be ROW_NUMBER = 1 
      107         1000   04/18/2013

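Note that only one of the rows above will be assigned ROW_NUMBER = 1, and it can be any of the three. Oracle decides, not you.

If this uncertainty is a problem, order by additional columns to get a clear winner. For this example, the highest pi.MatchID will be used to determine the "true" ROW_NUMBER = 1: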

-- replace `ROW_NUMBER...` in the query above with this:
    ROW_NUMBER() OVER (
      PARTITION BY pi.PlayerID
      ORDER BY pf.StartDate DESC, pi.MatchID DESC) AS StartDateRank

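Now, if there is a tie for the highest pf.StartDate, Oracle looks for the highest pi.MatchID within the subset of rows that have the highest pf.StartDate. As it turns out, only one row satisfies this condition: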

pi.MatchID  pi.PlayerID  pf.StartDate
----------  -----------  ------------
      100         1000   05/01/2013
      101         1000   04/29/2013
      105         1000   05/01/2013 <-- is ROW_NUMBER = 1: highest MatchID for
                                     -- most recent StartDate (5/1/2013)
      102         1000   05/01/2013
      107         1000   04/18/2013 <-- not considered: has the highest MatchID but isn't
                                     -- in the subset with the most recent StartDate
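
The tie-break can be verified on the five example rows. This is a sketch under the same assumptions as before: Python's sqlite3 module (SQLite 3.25 or later) stands in for Oracle, and the table name `tie_demo` is invented for the illustration.

```python
import sqlite3

def tie_break_order():
    """Return (match_id, row_number) pairs for the five example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tie_demo (match_id INTEGER, player_id INTEGER, start_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO tie_demo VALUES (?, ?, ?)",
        [
            (100, 1000, "2013-05-01"),
            (101, 1000, "2013-04-29"),
            (105, 1000, "2013-05-01"),
            (102, 1000, "2013-05-01"),
            (107, 1000, "2013-04-18"),
        ],
    )
    rows = conn.execute(
        """
        SELECT match_id,
               ROW_NUMBER() OVER (
                   PARTITION BY player_id
                   ORDER BY start_date DESC, match_id DESC
               ) AS start_date_rank
        FROM tie_demo
        ORDER BY start_date_rank
        """
    ).fetchall()
    conn.close()
    return rows

print(tie_break_order())
```

With the second sort key in place the numbering no longer depends on the database's whim: MatchID 105 is numbered 1, followed by 102 and 100, then 101 and 107.
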
answered 2013-05-03T02:31:40.283

You can use the rank() function.

SELECT * FROM (
    SELECT pi.MatchID, pi.PlayerID, t.TeamID, m.MatchDate, pf.StartDate,
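     -- rank by latest StartDate; pf.rowid breaks ties (an unqualified rowid is ambiguous in a join)
     rank() over (partition by pi.PlayerID order by pf.StartDate desc, pf.rowid) as RNK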
    FROM Plays_In pi, Match m, Plays_A pa, Team t, Plays_For pf, Made_Up_Of muo, Season s
    WHERE pi.MatchID = m.MatchID
    AND m.MatchID = pa.MatchID
    AND pa.TeamID = t.TeamID
    AND pf.PlayerID = pi.PlayerID
    AND pf.TeamID = t.TeamID
    AND muo.MatchID = pi.MatchID
    AND muo.SeasonID = s.SeasonID
    AND pi.MatchID = '&match_id'
    AND m.MatchDate >= pf.StartDate
) WHERE RNK = 1
ORDER BY MatchID ASC, PlayerID ASC, StartDate DESC;
answered 2013-05-03T02:16:45.210

Maybe use INTERSECT and then find MAX(StartDate) using GROUP BY?
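
One possible reading of the GROUP BY part of this suggestion is sketched below; it leaves INTERSECT aside and joins each row back to its player's MAX(StartDate) instead, which is an assumption about what was meant. It uses Python's sqlite3 module as a stand-in for Oracle, and the flat table `joined_rows` is hypothetical, holding the eleven rows from the question with ISO-formatted dates.

```python
import sqlite3

# The eleven rows from the question (MatchID, PlayerID, TeamID, MatchDate, StartDate).
GROUP_BY_ROWS = [
    (20, 5, 2, "2012-01-14", "2011-06-01"),
    (20, 5, 4, "2012-01-14", "2010-06-01"),
    (20, 7, 4, "2012-01-14", "2011-06-01"),
    (20, 7, 2, "2012-01-14", "2010-06-01"),
    (20, 10, 4, "2012-01-14", "2011-06-01"),
    (20, 11, 2, "2012-01-14", "2010-06-01"),
    (20, 13, 2, "2012-01-14", "2011-06-01"),
    (20, 16, 4, "2012-01-14", "2010-06-01"),
    (20, 17, 4, "2012-01-14", "2010-06-01"),
    (20, 18, 4, "2012-01-14", "2010-06-01"),
    (20, 19, 2, "2012-01-14", "2011-06-01"),
]

def latest_start_via_group_by(rows):
    """Keep rows whose start_date equals the player's MAX(start_date)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE joined_rows ("
        "match_id INTEGER, player_id INTEGER, team_id INTEGER, "
        "match_date TEXT, start_date TEXT)"
    )
    conn.executemany("INSERT INTO joined_rows VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT j.match_id, j.player_id, j.team_id, j.match_date, j.start_date
        FROM joined_rows j
        JOIN (
            SELECT player_id, MAX(start_date) AS max_start
            FROM joined_rows
            GROUP BY player_id
        ) latest
          ON latest.player_id = j.player_id
         AND latest.max_start = j.start_date
        ORDER BY j.match_id, j.player_id
        """
    ).fetchall()
    conn.close()
    return result

for row in latest_start_via_group_by(GROUP_BY_ROWS):
    print(row)
```

Unlike ROW_NUMBER(), this approach returns every tied row when a player has more than one row with the same latest StartDate; the question's data has no such ties, so it yields one row per player here.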

answered 2013-05-03T02:10:43.533