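Perhaps you should use TDQD: Test-Driven Query Design.
Q1: Which directors worked with which actors
The first step is to determine which directors have worked with which actors on a movie. That can be found by joining the two tables on the Movie_ID column: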
SELECT d.Movie_ID, d.Director_ID, a.Actor_ID
  FROM Movies_Directors AS d
  JOIN Roles AS a
    ON d.Movie_ID = a.Movie_ID
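Since you haven't told us the primary keys of the tables, we can't tell whether a single actor can be recorded as having several different roles in a single movie. I'm going to assume that the primary key of the Roles table is the combination (Movie_ID, Actor_ID), so each actor appears at most once per movie.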
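To make that assumption concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a scratch database. The column types, the key on Movies_Directors, and the sample IDs are all made up for illustration, since the real schema isn't shown; only the (Movie_ID, Actor_ID) key on Roles reflects the assumption above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies_Directors (
        Movie_ID    INTEGER NOT NULL,
        Director_ID INTEGER NOT NULL,
        PRIMARY KEY (Movie_ID, Director_ID)   -- hypothetical key
    );
    CREATE TABLE Roles (
        Movie_ID INTEGER NOT NULL,
        Actor_ID INTEGER NOT NULL,
        PRIMARY KEY (Movie_ID, Actor_ID)      -- the assumed key
    );
    INSERT INTO Movies_Directors VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO Roles VALUES (1, 100), (1, 101), (2, 100), (3, 101);
""")

# Q1: one row per (movie, director, actor) combination.
q1 = """
    SELECT d.Movie_ID, d.Director_ID, a.Actor_ID
      FROM Movies_Directors AS d
      JOIN Roles AS a
        ON d.Movie_ID = a.Movie_ID
     ORDER BY d.Movie_ID, a.Actor_ID
"""
q1_rows = conn.execute(q1).fetchall()
print(q1_rows)

# With the assumed key, a second role for the same actor in the same
# movie is rejected, so the join cannot produce duplicate rows.
try:
    conn.execute("INSERT INTO Roles VALUES (1, 100)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

If the real Roles table does allow several roles per actor per movie, the join would return one row per role, and counting distinct movies (COUNT(DISTINCT d.Movie_ID)) rather than rows would probably be the safer choice in the later steps.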
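Q2: How many times has each director worked with each actor
We need to count the rows from the query above for each combination of actor and director: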
SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS num_joint_movies
  FROM Movies_Directors AS d
  JOIN Roles AS a
    ON d.Movie_ID = a.Movie_ID
 GROUP BY d.Director_ID, a.Actor_ID
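The "test-driven" part means checking each step against data where you already know the answer. A small sketch of that for Q2, again assuming Python's sqlite3 module and invented sample IDs (directors 10 and 20, actors 100 to 102):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies_Directors (Movie_ID INTEGER, Director_ID INTEGER);
    CREATE TABLE Roles (Movie_ID INTEGER, Actor_ID INTEGER,
                        PRIMARY KEY (Movie_ID, Actor_ID));
    -- Director 10 made movies 1-3; director 20 made movies 4-5.
    INSERT INTO Movies_Directors VALUES (1, 10), (2, 10), (3, 10), (4, 20), (5, 20);
    INSERT INTO Roles VALUES (1, 100), (1, 101), (2, 100), (2, 101),
                             (3, 100), (4, 100), (4, 102), (5, 102);
""")

# Q2, with an ORDER BY added only so the output is deterministic.
q2 = """
    SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS num_joint_movies
      FROM Movies_Directors AS d
      JOIN Roles AS a
        ON d.Movie_ID = a.Movie_ID
     GROUP BY d.Director_ID, a.Actor_ID
     ORDER BY d.Director_ID, a.Actor_ID
"""
q2_rows = conn.execute(q2).fetchall()
for director, actor, n in q2_rows:
    print(director, actor, n)
```

Counting by hand from the sample rows gives 3 joint movies for (10, 100), 2 for (10, 101), 1 for (20, 100) and 2 for (20, 102), which is what the query should report.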
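Q3: For each director, what is the maximum number of times they've worked with an actor?
We now need to find the maximum number of joint movies for each director from the results above. That requires using the query above as if it were a table, like this: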
SELECT n.Director_ID, MAX(n.num_joint_movies) AS max_joint_movies
  FROM (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS num_joint_movies
          FROM Movies_Directors AS d
          JOIN Roles AS a
            ON d.Movie_ID = a.Movie_ID
         GROUP BY d.Director_ID, a.Actor_ID
       ) AS n
 GROUP BY n.Director_ID
Q4: Actors who've worked with each director the most
Now we need to combine queries Q2 and Q3 to get the actors:
SELECT q3.Director_ID, q2.Actor_ID
  FROM (SELECT n.Director_ID, MAX(n.Num_Joint_Movies) AS Max_Joint_Movies
          FROM (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
                  FROM Movies_Directors AS d
                  JOIN Roles AS a
                    ON d.Movie_ID = a.Movie_ID
                 GROUP BY d.Director_ID, a.Actor_ID
               ) AS n
         GROUP BY n.Director_ID
       ) AS q3
  JOIN (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
          FROM Movies_Directors AS d
          JOIN Roles AS a
            ON d.Movie_ID = a.Movie_ID
         GROUP BY d.Director_ID, a.Actor_ID
       ) AS q2
    ON q3.Director_ID = q2.Director_ID AND q3.Max_Joint_Movies = q2.Num_Joint_Movies
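One way to keep the building-block structure visible is to assemble Q3 and Q4 from the Q2 text and check each stage as you go. The sketch below does that with Python string formatting and sqlite3; the harness and the sample data are my own invention. The data is chosen so that director 20 has two actors tied on 2 movies each, which shows that joining on the maximum returns every tied actor rather than picking one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies_Directors (Movie_ID INTEGER, Director_ID INTEGER);
    CREATE TABLE Roles (Movie_ID INTEGER, Actor_ID INTEGER,
                        PRIMARY KEY (Movie_ID, Actor_ID));
    INSERT INTO Movies_Directors VALUES (1, 10), (2, 10), (3, 20), (4, 20);
    INSERT INTO Roles VALUES (1, 100), (1, 101), (2, 100),
                             (3, 102), (3, 103), (4, 102), (4, 103);
""")

# Q2: joint-movie count per (director, actor).
q2 = """SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
          FROM Movies_Directors AS d
          JOIN Roles AS a
            ON d.Movie_ID = a.Movie_ID
         GROUP BY d.Director_ID, a.Actor_ID"""

# Q3: wrap Q2 as a derived table and take the maximum per director.
q3 = f"""SELECT n.Director_ID, MAX(n.Num_Joint_Movies) AS Max_Joint_Movies
           FROM ({q2}) AS n
          GROUP BY n.Director_ID"""

# Q4: join Q3 back to Q2 to recover the actors that reach the maximum.
q4 = f"""SELECT q3.Director_ID, q2.Actor_ID
           FROM ({q3}) AS q3
           JOIN ({q2}) AS q2
             ON q3.Director_ID = q2.Director_ID
            AND q3.Max_Joint_Movies = q2.Num_Joint_Movies"""

# ORDER BY is appended only to make the printed output deterministic.
q3_rows = conn.execute(q3 + " ORDER BY n.Director_ID").fetchall()
q4_rows = conn.execute(q4 + " ORDER BY q3.Director_ID, q2.Actor_ID").fetchall()
print(q3_rows)
print(q4_rows)
```

If you need exactly one actor per director, you would have to add a tie-breaking rule of your own; the query as written deliberately reports all of them.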
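Q5: Using a Common Table Expression (CTE)
Standard SQL allows the query to be simplified with a common table expression, introduced by a WITH clause, like this: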
WITH cte AS
    (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
       FROM Movies_Directors AS d
       JOIN Roles AS a
         ON d.Movie_ID = a.Movie_ID
      GROUP BY d.Director_ID, a.Actor_ID
    )
SELECT q3.Director_ID, cte.Actor_ID
  FROM (SELECT cte.Director_ID, MAX(cte.Num_Joint_Movies) AS Max_Joint_Movies
          FROM cte
         GROUP BY cte.Director_ID
       ) AS q3
  JOIN cte
    ON q3.Director_ID = cte.Director_ID AND q3.Max_Joint_Movies = cte.Num_Joint_Movies
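When a query is rewritten like this, it is worth confirming that the answer has not changed. A rough sketch of one way to do that: run the CTE version in SQLite (which supports WITH) and compare it with the same answer computed independently in plain Python. The sample data, the Python harness, and the added ORDER BY are assumptions for illustration only.

```python
from collections import Counter
import sqlite3

# Hypothetical sample data: (Movie_ID, Director_ID) and (Movie_ID, Actor_ID).
movies_directors = [(1, 10), (2, 10), (3, 10), (4, 20), (5, 20)]
roles = [(1, 100), (1, 101), (2, 100), (2, 101),
         (3, 100), (4, 100), (4, 102), (5, 102)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies_Directors (Movie_ID INTEGER, Director_ID INTEGER)")
conn.execute("CREATE TABLE Roles (Movie_ID INTEGER, Actor_ID INTEGER, "
             "PRIMARY KEY (Movie_ID, Actor_ID))")
conn.executemany("INSERT INTO Movies_Directors VALUES (?, ?)", movies_directors)
conn.executemany("INSERT INTO Roles VALUES (?, ?)", roles)

cte_query = """
    WITH cte AS
        (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
           FROM Movies_Directors AS d
           JOIN Roles AS a
             ON d.Movie_ID = a.Movie_ID
          GROUP BY d.Director_ID, a.Actor_ID
        )
    SELECT q3.Director_ID, cte.Actor_ID
      FROM (SELECT cte.Director_ID, MAX(cte.Num_Joint_Movies) AS Max_Joint_Movies
              FROM cte
             GROUP BY cte.Director_ID
           ) AS q3
      JOIN cte
        ON q3.Director_ID = cte.Director_ID
       AND q3.Max_Joint_Movies = cte.Num_Joint_Movies
     ORDER BY q3.Director_ID, cte.Actor_ID
"""
sql_rows = conn.execute(cte_query).fetchall()

# Independent check: count (director, actor) pairs that share a movie,
# then keep the pairs that reach their director's maximum.
counts = Counter((director, actor)
                 for movie_d, director in movies_directors
                 for movie_r, actor in roles
                 if movie_d == movie_r)
best = {}
for (director, actor), n in counts.items():
    best[director] = max(best.get(director, 0), n)
expected = sorted(pair for pair, n in counts.items() if n == best[pair[0]])

print(sql_rows)
print(expected)
```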
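Q6: Maximum number of collaborations for any director
Since the question has changed since I started answering, the results shown above may not be exactly what's wanted, though the revised question doesn't clearly state what is required. Nevertheless, the general technique of breaking the problem down into answerable sub-queries is valuable; it is how I handle any query like this. If you are looking for the single director + actor combination that has collaborated most often, then we need to revise Q3 to find the maximum number of joint movies across all directors: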
SELECT MAX(n.num_joint_movies) AS max_joint_movies
  FROM (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS num_joint_movies
          FROM Movies_Directors AS d
          JOIN Roles AS a
            ON d.Movie_ID = a.Movie_ID
         GROUP BY d.Director_ID, a.Actor_ID
       ) AS n
Q7: The actor and director who have worked together the most
We now need to combine Q6 with Q2 again:
SELECT q2.Director_ID, q2.Actor_ID
  FROM (SELECT MAX(n.Num_Joint_Movies) AS Max_Joint_Movies
          FROM (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
                  FROM Movies_Directors AS d
                  JOIN Roles AS a
                    ON d.Movie_ID = a.Movie_ID
                 GROUP BY d.Director_ID, a.Actor_ID
               ) AS n
       ) AS q6
  JOIN (SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
          FROM Movies_Directors AS d
          JOIN Roles AS a
            ON d.Movie_ID = a.Movie_ID
         GROUP BY d.Director_ID, a.Actor_ID
       ) AS q2
    ON q6.Max_Joint_Movies = q2.Num_Joint_Movies
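As a final check, here is a sketch that runs this last query on invented data, first with a single clear winner and then after adding one more movie so that two pairs tie for the overall maximum. It assumes Python's sqlite3 module as the test harness; the single-row subquery from Q6 is aliased q6 here, and an ORDER BY is added for stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies_Directors (Movie_ID INTEGER, Director_ID INTEGER);
    CREATE TABLE Roles (Movie_ID INTEGER, Actor_ID INTEGER,
                        PRIMARY KEY (Movie_ID, Actor_ID));
    INSERT INTO Movies_Directors VALUES (1, 10), (2, 10), (3, 10), (4, 20), (5, 20);
    INSERT INTO Roles VALUES (1, 100), (2, 100), (3, 100), (1, 101),
                             (4, 102), (5, 102);
""")

# Q2: joint-movie count per (director, actor).
q2 = """SELECT d.Director_ID, a.Actor_ID, COUNT(*) AS Num_Joint_Movies
          FROM Movies_Directors AS d
          JOIN Roles AS a
            ON d.Movie_ID = a.Movie_ID
         GROUP BY d.Director_ID, a.Actor_ID"""

# Q7: join the overall maximum (Q6) back to Q2.
q7 = f"""
    SELECT q2.Director_ID, q2.Actor_ID
      FROM (SELECT MAX(n.Num_Joint_Movies) AS Max_Joint_Movies
              FROM ({q2}) AS n
           ) AS q6
      JOIN ({q2}) AS q2
        ON q6.Max_Joint_Movies = q2.Num_Joint_Movies
     ORDER BY q2.Director_ID, q2.Actor_ID
"""

# Director 10 and actor 100 share 3 movies; no other pair has more than 2.
single_winner = conn.execute(q7).fetchall()
print(single_winner)

# Give director 20 and actor 102 a third joint movie to create a tie.
conn.execute("INSERT INTO Movies_Directors VALUES (6, 20)")
conn.execute("INSERT INTO Roles VALUES (6, 102)")
tied_winners = conn.execute(q7).fetchall()
print(tied_winners)
```

As with Q4, a tie for the maximum produces one row per tied pair, so "the" most frequent combination may be more than one row.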