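I'm optimising some SQL queries (this could be considered part 2 of a question I posted recently) and replacing some NOT IN predicates with NOT EXISTS.
Am I right in thinking that the main benefit is that NOT EXISTS can stop as soon as a single match is found, whereas NOT IN, or a COUNT subquery compared with 0, would have to do a full table scan?
It also seems that NOT IN needs extra work if the selected data contains NULLs. Is that correct?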
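To make the NULL concern concrete, here is a minimal sketch using Python's built-in sqlite3 module. The two toy tables are hypothetical stand-ins for my real ones, and SQLite is not SQL Server, so it says nothing about performance. It only shows the standard three-valued logic: as far as I understand it, a single NULL in the subquery's result makes NOT IN return no rows at all, while NOT EXISTS is unaffected.

```python
import sqlite3

# In-memory toy schema; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sessions (session_id INTEGER);
    CREATE TABLE conversions (tracked_session_id INTEGER);
    INSERT INTO sessions VALUES (1), (2), (3);
    INSERT INTO conversions VALUES (1), (NULL);
""")

# NOT IN: "2 NOT IN (1, NULL)" evaluates to UNKNOWN, so the row is filtered out.
not_in_rows = conn.execute("""
    SELECT session_id
    FROM sessions
    WHERE session_id NOT IN (SELECT tracked_session_id FROM conversions)
    ORDER BY session_id
""").fetchall()

# NOT EXISTS: the NULL row never satisfies the equality, so it is simply ignored.
not_exists_rows = conn.execute("""
    SELECT d.session_id
    FROM sessions AS d
    WHERE NOT EXISTS (SELECT 1
                      FROM conversions AS c
                      WHERE c.tracked_session_id = d.session_id)
    ORDER BY d.session_id
""").fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(2,), (3,)]
```

So if tracked_session_id can ever be NULL, the two forms are not functionally equivalent, and that would need checking against the real table definition.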
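Before I implement these in the proc, I need to be sure that in both of the cases below the second statement is better than the first (and functionally equivalent):
Case 1: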
    --exclude sessions that were tracked as part of a conversion during the last response_time minutes
    -- AND session_id NOT IN (SELECT DISTINCT tracked_session_id
    --                        FROM data.conversions WITH (NOLOCK)
    --                        WHERE client_id = @client_id
    --                        AND utc_date_completed >= DATEADD(minute, (-2) * cy.response_time, @date)
    --                        AND utc_date_completed <= @date
    --                        AND utc_date_clicked <= @date)
    AND NOT EXISTS (SELECT 1
                    FROM data.conversions WITH (NOLOCK)
                    WHERE client_id = @client_id
                    AND utc_date_completed >= DATEADD(minute, (-2) * cy.response_time, @date)
                    AND utc_date_completed <= @date
                    AND utc_date_clicked <= @date
                    AND data.conversions.tracked_session_id = d.session_id
                   )
Case 2:
    -- NOT EXISTS vs full table scan with COUNT(dashboard_id)
    -- AND (SELECT COUNT(dashboard_id)
    --      FROM data.dashboard_responses WITH(NOLOCK)
    --      WHERE session_id = d.session_id
    --      AND cycle_id = cy.id
    --      AND client_id = @client_id) = 0
    AND NOT EXISTS(SELECT 1
                   FROM data.dashboard_responses
                   WHERE session_id = d.session_id
                   AND cycle_id = cy.id
                   AND client_id = @client_id)
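One thing I noticed while checking equivalence for case 2: COUNT(dashboard_id) counts only non-NULL values, so it matches NOT EXISTS only if dashboard_id cannot be NULL. Below is a small sketch of that difference, again with Python's sqlite3 and a hypothetical, simplified schema (cycle_id and client_id are left out). I don't know offhand whether dashboard_id is nullable in the real table, so treat this as an illustration of the semantics rather than a statement about my data.

```python
import sqlite3

# Simplified toy tables; the real ones have more columns (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sessions (session_id INTEGER);
    CREATE TABLE dashboard_responses (session_id INTEGER, dashboard_id INTEGER);
    INSERT INTO sessions VALUES (1), (2), (3);
    -- session 1 has two responses, session 2 has one with a NULL dashboard_id,
    -- session 3 has none.
    INSERT INTO dashboard_responses VALUES (1, 10), (1, 11), (2, NULL);
""")

def session_ids(predicate):
    """Return the session ids that pass the given WHERE predicate."""
    sql = f"SELECT d.session_id FROM sessions AS d WHERE {predicate} ORDER BY d.session_id"
    return [row[0] for row in conn.execute(sql)]

# COUNT(column) skips NULLs, so session 2 looks as if it had no responses.
count_column = session_ids(
    "(SELECT COUNT(dashboard_id) FROM dashboard_responses r "
    " WHERE r.session_id = d.session_id) = 0")

# COUNT(*) counts rows regardless of NULLs.
count_star = session_ids(
    "(SELECT COUNT(*) FROM dashboard_responses r "
    " WHERE r.session_id = d.session_id) = 0")

# NOT EXISTS only asks whether any matching row exists.
not_exists = session_ids(
    "NOT EXISTS (SELECT 1 FROM dashboard_responses r "
    "            WHERE r.session_id = d.session_id)")

print(count_column)  # [2, 3]
print(count_star)    # [3]
print(not_exists)    # [3]
```

If dashboard_id is declared NOT NULL, all three forms return the same rows and the rewrite is safe from a correctness point of view.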
Cheers