I came across a piece of slow-running code that looks very much like this:
SELECT
res.[X],
res.[Y],
SUM(res.[Z]) -- This is SUM so I have to remove duplicates
FROM (
SELECT DISTINCT a.[X], a.[Y], b.[Z] FROM [A] a JOIN [B] b ON a.[ID] = b.[ID]
UNION
SELECT DISTINCT a.[X], a.[Y], c.[Z] FROM [A] a JOIN [C] c ON a.[ID] = c.[ID]
UNION
SELECT DISTINCT a.[X], a.[Y], d.[Z] FROM [A] a JOIN [D] d ON a.[ID] = d.[ID]
UNION ALL -- This set won't have duplicates, hence the UNION ALL in this case
SELECT a.[X], a.[Y], n.[Z] FROM [A] a JOIN [N] n ON a.[ID] = n.[ID]
) res
GROUP BY res.[X], res.[Y]
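The real joins are a lot more complex, and there are 12 UNION/UNION ALLs in the query, but you get the idea. Each result set typically contains between 1 and 15 million rows.
I was wondering how others would write this query. I have read a few other threads that warn against: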
SELECT DISTINCT * FROM [A]
UNION
SELECT DISTINCT * FROM [B]
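because duplicates are effectively removed 3 times in this small example (once by each DISTINCT and once more by the UNION). So I tried it and removed the DISTINCTs. The result was actually a lot slower. I don't understand how removing extra filtering could cause the query to run slower.
Does anyone have any ideas? I'm looking into the query plan, but it's too large to post, so I'm just looking for suggestions. Thanks!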
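To show what I mean by the DISTINCT being redundant, here is a minimal sketch. It assumes SQLite through Python's `sqlite3` module rather than my actual SQL Server database, and the tables and rows are made up for illustration. It only checks that both forms return the same rows; it says nothing about which one is faster.

```python
import sqlite3

# Hypothetical toy tables and data, only loosely modelled on the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, X TEXT, Y TEXT);
    CREATE TABLE B (ID INTEGER, Z INTEGER);
    CREATE TABLE C (ID INTEGER, Z INTEGER);
    INSERT INTO A VALUES (1, 'x1', 'y1'), (2, 'x1', 'y1'), (3, 'x2', 'y2');
    INSERT INTO B VALUES (1, 10), (2, 10), (3, 5);
    INSERT INTO C VALUES (1, 10), (3, 7);
""")

# Form 1: DISTINCT in every branch, then UNION.
with_distinct = """
    SELECT DISTINCT a.X, a.Y, b.Z FROM A a JOIN B b ON a.ID = b.ID
    UNION
    SELECT DISTINCT a.X, a.Y, c.Z FROM A a JOIN C c ON a.ID = c.ID
"""

# Form 2: the same query with the per-branch DISTINCT removed.
# UNION (unlike UNION ALL) still removes duplicates from the combined set.
without_distinct = with_distinct.replace("DISTINCT ", "")

rows_with = sorted(conn.execute(with_distinct).fetchall())
rows_without = sorted(conn.execute(without_distinct).fetchall())

print(rows_with == rows_without)
print(rows_with)
```

So the two forms are logically equivalent here, which is why I expected the version without DISTINCT to be at least as fast.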