3

I am trying to run a simple SQL query:

SELECT DISTINCT id
FROM marketing
WHERE type = 'email'
  AND id NOT IN (
                SELECT id
                FROM marketing
                WHERE type = 'letter'
                )
ORDER BY id;

It takes a really long time to run. I assume the subquery in the WHERE clause is the cause (there are a large number of ids), but I can't come up with a way to improve it.

First, can this be the reason the query is so slow? Second, any suggestions on how to improve it?

Edit:

Database system: MySQL

The id column is indexed but is not a primary key in this table; it is a foreign key.

4 Answers

2
select distinct id
from   marketing a
where  type = 'email'
and    not exists (
           select 'X'
           from   marketing b
           where  a.id = b.id
           and    type = 'letter' )
order by id
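
One difference between the two forms is worth checking if id can be NULL (it is a foreign key here, so it might be). A minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL and made-up toy rows; it checks results only, not MySQL's execution plan or timing:

```python
import sqlite3

# SQLite stands in for MySQL here; the rows are made-up toy data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marketing (id INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO marketing (id, type) VALUES (?, ?)",
    [(1, "email"), (2, "email"), (2, "letter"), (None, "letter")],
)

# Original form from the question.
not_in = conn.execute("""
    SELECT DISTINCT id
    FROM marketing
    WHERE type = 'email'
      AND id NOT IN (SELECT id FROM marketing WHERE type = 'letter')
    ORDER BY id
""").fetchall()

# Correlated NOT EXISTS form from this answer.
not_exists = conn.execute("""
    SELECT DISTINCT a.id
    FROM marketing a
    WHERE a.type = 'email'
      AND NOT EXISTS (SELECT 1
                      FROM marketing b
                      WHERE b.id = a.id AND b.type = 'letter')
    ORDER BY a.id
""").fetchall()

print(not_in)      # NOT IN against a list containing NULL never evaluates to true
print(not_exists)  # NOT EXISTS ignores the NULL row and keeps id 1
```

If id is never NULL in the real table, the two forms return the same ids and the choice comes down to how the optimizer handles each.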
answered 2013-05-09T20:11:23.153
2

There is a known pattern for this type of query: get all rows that have no match in another set.

select m1.id from marketing m1
left outer join marketing m2 on m1.id = m2.id and m2.type = 'letter'
where m1.type = 'email' and m2.id IS NULL

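This returns all rows in marketing of type 'email' for which no row with the same id and type 'letter' exists. If you want the other set, use IS NOT NULL. For maximum execution speed you need a proper index on the id column, with type included as a covering column.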
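
The index advice can be tried out on toy data. This is a rough sketch, not a benchmark: it assumes SQLite through Python's sqlite3 module as a stand-in for MySQL, a hypothetical index name `idx_marketing_id_type`, and made-up rows. It also adds DISTINCT, since the join returns one row per matching 'email' row.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows are made-up toy data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marketing (id INTEGER, type TEXT)")

# Hypothetical index name. With (id, type) the join probe can be
# answered from the index without reading the table rows.
conn.execute("CREATE INDEX idx_marketing_id_type ON marketing (id, type)")

conn.executemany(
    "INSERT INTO marketing (id, type) VALUES (?, ?)",
    [(1, "email"), (1, "email"), (2, "email"), (2, "letter"), (3, "letter")],
)

anti_join = """
    SELECT DISTINCT m1.id
    FROM marketing m1
    LEFT OUTER JOIN marketing m2
           ON m1.id = m2.id AND m2.type = 'letter'
    WHERE m1.type = 'email' AND m2.id IS NULL
    ORDER BY m1.id
"""

# DISTINCT collapses the two 'email' rows for id 1 into one result.
ids = [row[0] for row in conn.execute(anti_join)]
print(ids)

# Show whatever plan SQLite picked; in MySQL, EXPLAIN plays the same role.
for row in conn.execute("EXPLAIN QUERY PLAN " + anti_join):
    print(row)
```

On the real table, running MySQL's EXPLAIN on the query before and after adding the index is the way to confirm whether the index is actually used.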

answered 2013-05-09T20:03:41.880
1

You can also phrase this as an aggregation query. The condition you are looking for is that an id has at least one row where type = 'email' and no row where type = 'letter':

select id
from marketing m
group by id
having SUM(case when type = 'letter' then 1 else 0 end) = 0 and
       SUM(case when type = 'email' then 1 else 0 end) > 0

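With an index on marketing(id, type), this query might run faster. In MySQL versions before 8.0 the order by id is redundant, because group by sorts implicitly; MySQL 8.0 and later no longer sort implicitly, so add order by id there if you need ordered output.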
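
To see how the two conditional sums behave, here is a small sketch on made-up rows. It assumes SQLite through Python's sqlite3 module as a stand-in for MySQL, and it adds an explicit ORDER BY rather than relying on GROUP BY to sort.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows are made-up toy data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marketing (id INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO marketing (id, type) VALUES (?, ?)",
    [
        (1, "email"), (1, "email"),   # email only: kept
        (2, "email"), (2, "letter"),  # has a letter: dropped
        (3, "letter"),                # no email: dropped
        (4, "phone"),                 # neither type: dropped
        (5, "email"),                 # email only: kept
    ],
)

aggregate = """
    SELECT id
    FROM marketing
    GROUP BY id
    HAVING SUM(CASE WHEN type = 'letter' THEN 1 ELSE 0 END) = 0
       AND SUM(CASE WHEN type = 'email'  THEN 1 ELSE 0 END) > 0
    ORDER BY id
"""

ids = [row[0] for row in conn.execute(aggregate)]
print(ids)
```

Each id is visited once and the two counters decide whether it stays, so no second pass over the table is needed; whether that beats the anti-join in MySQL depends on the data and the available indexes.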

answered 2013-05-09T20:13:56.210
1

Here is an alternative to your query, although according to Quassnoi here (MySQL) it should perform similarly.

   select email.id
     from marketing email
left join marketing letter on letter.type='letter' and letter.id=email.id
    where email.type='email' and letter.id is null
 group by email.id
 order by email.id;

The three main ways to write this kind of query are NOT IN, NOT EXISTS (correlated), and LEFT JOIN / IS NULL. Quassnoi compares them for MySQL (linked above), SQL Server, Oracle, and PostgreSQL.
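
Before comparing timings, it can help to confirm that the three forms return the same ids on your data. A minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL and randomly generated rows with no NULL ids; it says nothing about MySQL performance:

```python
import random
import sqlite3

random.seed(0)  # fixed seed so the toy data is reproducible

# SQLite stands in for MySQL here; the rows are random toy data without NULL ids.
rows = [
    (random.randint(1, 50), random.choice(["email", "letter", "phone"]))
    for _ in range(200)
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marketing (id INTEGER, type TEXT)")
conn.executemany("INSERT INTO marketing (id, type) VALUES (?, ?)", rows)

queries = {
    "NOT IN": """
        SELECT DISTINCT id FROM marketing
        WHERE type = 'email'
          AND id NOT IN (SELECT id FROM marketing WHERE type = 'letter')
        ORDER BY id
    """,
    "NOT EXISTS": """
        SELECT DISTINCT a.id FROM marketing a
        WHERE a.type = 'email'
          AND NOT EXISTS (SELECT 1 FROM marketing b
                          WHERE b.id = a.id AND b.type = 'letter')
        ORDER BY a.id
    """,
    "LEFT JOIN / IS NULL": """
        SELECT email.id FROM marketing email
        LEFT JOIN marketing letter
               ON letter.type = 'letter' AND letter.id = email.id
        WHERE email.type = 'email' AND letter.id IS NULL
        GROUP BY email.id
        ORDER BY email.id
    """,
}

# Reference answer computed in plain Python: ids with an email but no letter.
emails = {i for i, t in rows if t == "email"}
letters = {i for i, t in rows if t == "letter"}
expected = sorted(emails - letters)

results = {name: [r[0] for r in conn.execute(sql)] for name, sql in queries.items()}
for name, ids in results.items():
    print(name, "matches reference:", ids == expected)
```

Once the forms agree, timing each one against the real MySQL table (and checking EXPLAIN) is what actually settles which is fastest.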

answered 2013-05-09T20:02:01.323