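I have the following query on MySQL. The logic is correct, but it takes a very long time to run because there are more than 10,000 seeker emails and more than 24,000 guest emails. Is there a better way to do this?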
SELECT g.email, g.name
FROM guest g
WHERE g.type='guest'
AND g.email NOT IN (SELECT email FROM seeker GROUP BY email)
GROUP BY g.email
Try this:
SELECT
g.email, g.name
FROM
guest g
LEFT JOIN
seeker s
ON
s.email = g.email
WHERE
g.type = 'guest'
AND
s.email IS NULL
GROUP BY
g.email;
SELECT DISTINCT g.email, g.name
FROM guest g
WHERE g.type='guest'
AND NOT EXISTS (SELECT 1 FROM seeker s WHERE g.email = s.email)
Also make sure you have an index on seeker.email, guest.type, and guest.email. If those columns are declared NOT NULL, even better.
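Why NOT NULL matters can be shown with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, made only so the example runs anywhere), and the sample rows and index names are invented for illustration. The NULL handling of NOT IN shown here is standard SQL three-valued logic, so MySQL is expected to behave the same way.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE guest  (email TEXT, name TEXT, type TEXT);
    CREATE TABLE seeker (email TEXT);
    -- Hypothetical index names; one possible reading of the advice above.
    CREATE INDEX idx_seeker_email ON seeker (email);
    CREATE INDEX idx_guest_type_email ON guest (type, email);
    INSERT INTO guest VALUES
        ('a@example.com', 'Ann', 'guest'),
        ('b@example.com', 'Bob', 'guest'),
        ('c@example.com', 'Cid', 'member');
    INSERT INTO seeker VALUES ('a@example.com');
""")

NOT_IN = """
    SELECT DISTINCT g.email, g.name FROM guest g
    WHERE g.type = 'guest'
      AND g.email NOT IN (SELECT email FROM seeker)
"""
NOT_EXISTS = """
    SELECT DISTINCT g.email, g.name FROM guest g
    WHERE g.type = 'guest'
      AND NOT EXISTS (SELECT 1 FROM seeker s WHERE s.email = g.email)
"""

def run(sql):
    """Run a query and return its rows in a stable order."""
    return sorted(conn.execute(sql).fetchall())

# With no NULLs in seeker.email the two forms agree.
before_not_in = run(NOT_IN)
before_not_exists = run(NOT_EXISTS)

# A single NULL in seeker.email makes NOT IN evaluate to UNKNOWN for every
# non-matching guest, so no row passes; NOT EXISTS is unaffected.
conn.execute("INSERT INTO seeker VALUES (NULL)")
after_not_in = run(NOT_IN)
after_not_exists = run(NOT_EXISTS)

print(before_not_in, before_not_exists)
print(after_not_in, after_not_exists)
```

If seeker.email can contain NULL, the NOT EXISTS form is the safer of the two.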
You don't need the GROUP BY in the inner query. You can use DISTINCT instead.
SELECT g.email, g.name
FROM guest g
WHERE g.type='guest'
AND g.email NOT IN (SELECT DISTINCT email FROM seeker)
GROUP BY g.email
This will work too:
SELECT g.email, g.name
FROM guest g left outer join seeker s on g.email = s.email
WHERE g.type='guest'
AND s.email is null
GROUP BY g.email
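Your query does a lot of string comparisons, so it will help to index the email columns in both tables, especially seeker.
Also, avoid selecting columns that are neither aggregated nor listed in the GROUP BY. The result is indeterminate. From the MySQL manual:
The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause.
More in the manual.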
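One way to follow that advice is to aggregate the non-grouped column explicitly. The sketch below is only an illustration, not the asker's schema: it uses Python's sqlite3 module as a stand-in for MySQL, the sample rows are made up, and MIN(g.name) is just one arbitrary but deterministic way to pick a name per email.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE guest  (email TEXT NOT NULL, name TEXT, type TEXT);
    CREATE TABLE seeker (email TEXT NOT NULL);
    -- Made-up rows: the same guest email appears with two different names.
    INSERT INTO guest VALUES
        ('b@example.com', 'Bob',   'guest'),
        ('b@example.com', 'Bobby', 'guest'),
        ('a@example.com', 'Ann',   'guest');
    INSERT INTO seeker VALUES ('a@example.com');
""")

# Every selected column is either grouped (email) or aggregated (name),
# so the result no longer depends on which row the server happens to pick.
rows = conn.execute("""
    SELECT g.email, MIN(g.name) AS name
    FROM guest g
    WHERE g.type = 'guest'
      AND NOT EXISTS (SELECT 1 FROM seeker s WHERE s.email = g.email)
    GROUP BY g.email
    ORDER BY g.email
""").fetchall()
print(rows)
```

If any name per email is acceptable, MySQL 5.7 and later also provide ANY_VALUE(), which states that intent without forcing a particular choice.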
First, you don't need the GROUP BY in the subquery:
SELECT g.email, g.name
FROM guest g
WHERE g.type = 'guest' AND g.email NOT IN (SELECT email FROM seeker)
GROUP BY g.email
That might be enough. With an index on seeker(email), the following should optimize well:
SELECT g.email, g.name
FROM guest g
WHERE g.type = 'guest' AND
not exists (SELECT 1 FROM seeker where seeker.email = g.email)
GROUP BY g.email;
If you have a lot of duplicate email values in the tables, then I would not recommend the LEFT JOIN approach.
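One plausible reason for that caution (an interpretation, not something the answer spells out) is that a join produces one row for every matching pair before the IS NULL filter discards them. The sketch below counts those intermediate rows, using Python's sqlite3 module as a stand-in for MySQL and made-up data with deliberate duplicates. A real optimizer may avoid materializing all of these rows, so treat this as an illustration of the logical row count only.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE guest  (email TEXT, name TEXT, type TEXT);
    CREATE TABLE seeker (email TEXT);
""")

# Made-up data: one email appears 3 times in guest and 4 times in seeker.
conn.executemany(
    "INSERT INTO guest VALUES (?, ?, 'guest')",
    [("a@example.com", "Ann")] * 3 + [("b@example.com", "Bob")],
)
conn.executemany("INSERT INTO seeker VALUES (?)", [("a@example.com",)] * 4)

# Rows the LEFT JOIN yields before the IS NULL filter: 3 * 4 matched pairs
# for a@example.com, plus 1 unmatched row for b@example.com.
joined = conn.execute("""
    SELECT COUNT(*) FROM guest g
    LEFT JOIN seeker s ON s.email = g.email
    WHERE g.type = 'guest'
""").fetchone()[0]

# Rows that survive the anti-join filter.
kept = conn.execute("""
    SELECT COUNT(*) FROM guest g
    LEFT JOIN seeker s ON s.email = g.email
    WHERE g.type = 'guest' AND s.email IS NULL
""").fetchone()[0]

print(joined, kept)
```

NOT EXISTS only needs to find one matching seeker row per guest, so duplicates do not multiply its work in the same way.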
SELECT DISTINCT g.email, g.name
FROM guest g
LEFT OUTER JOIN seeker s ON s.email = g.email
WHERE g.type='guest' AND s.email IS NULL