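I'm learning SQL (using SQLite 3 and its sqlite3 command-line tool), and I've noticed that I can do some things in several ways, and it isn't always clear which one is better. Here are three queries that do the same thing: one uses `intersect`, another uses `inner join` and `distinct`, and the last is similar to the second but does the filtering through `where`. (The first was written by the author of the book I'm reading; the other two I wrote myself.)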
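The question is: which of these queries is better, and why? And, more generally, how can I tell when one query is better than another? Are there guidelines I have missed, or should I learn SQLite internals despite SQL's declarative nature?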
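(In the example below, the tables describe food names mentioned in a TV series. foods_episodes is the many-to-many link table, while the other two hold the food names and the episode names with their season numbers. Note that I'm looking for the all-time top ten foods, ranked by the number of their appearances across the whole series, and not just the top foods within seasons 3..5.)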
-- task
-- find the all-time top ten foods that appear in seasons 3 through 5
-- schema
-- CREATE TABLE episodes (
--   id integer primary key,
--   season int,
--   name text );
-- CREATE TABLE foods(
--   id integer primary key,
--   name text );
-- CREATE TABLE foods_episodes(
--   food_id integer,
--   episode_id integer );
-- query 1 (from the book): intersect
select f.* from foods f
inner join
  (select food_id, count(food_id) as count
   from foods_episodes
   group by food_id
   order by count(food_id) desc limit 10) top_foods
  on f.id = top_foods.food_id
intersect
select f.* from foods f
inner join foods_episodes fe on f.id = fe.food_id
inner join episodes e on fe.episode_id = e.id
where
  e.season between 3 and 5
order by
  f.name;
-- query 2 (mine): inner join + distinct
select
  distinct f.*
from
  foods_episodes as fe
  inner join episodes as e on e.id = fe.episode_id
  inner join foods as f on fe.food_id = f.id
  inner join (select food_id from foods_episodes
              group by food_id order by count(*) desc limit 10) as lol
    on lol.food_id = fe.food_id
where
  e.season between 3 and 5
order by
  f.name;
-- query 3 (mine): like query 2, but the top-ten filter is in the where clause
select
  distinct f.*
from
  foods_episodes as fe
  inner join episodes as e on e.id = fe.episode_id
  inner join foods as f on fe.food_id = f.id
where
  fe.food_id in (select food_id from foods_episodes
                 group by food_id order by count(*) desc limit 10)
  and e.season between 3 and 5
order by
  f.name;
-- output (same for all three):
-- id          name
-- ----------  ----------
-- 4           Bear Claws
-- 146         Decaf Capp
-- 153         Hennigen's
-- 55          Kasha
-- 94          Ketchup
-- 164         Naya Water
-- 317         Pizza
-- CPU Time: user 0.000000 sys 0.000000
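
Since the CPU time above is too small to separate the queries, one possible way to compare them is to look at the plans SQLite produces. The following is only a sketch using Python's standard sqlite3 module. The schema and the three queries are the ones from this question; the toy rows (12 foods, 12 episodes, names like food01) and the helper names build_toy_db and compare are assumptions made for illustration, not the book's data.

```python
import sqlite3

# Schema as given in the question.
SCHEMA = """
CREATE TABLE episodes (id integer primary key, season int, name text);
CREATE TABLE foods (id integer primary key, name text);
CREATE TABLE foods_episodes (food_id integer, episode_id integer);
"""

# The three queries from the question, unchanged apart from whitespace.
QUERIES = {
    "intersect": """
        select f.* from foods f
        inner join (select food_id, count(food_id) as count
                    from foods_episodes
                    group by food_id
                    order by count(food_id) desc limit 10) top_foods
          on f.id = top_foods.food_id
        intersect
        select f.* from foods f
        inner join foods_episodes fe on f.id = fe.food_id
        inner join episodes e on fe.episode_id = e.id
        where e.season between 3 and 5
        order by f.name""",
    "join + distinct": """
        select distinct f.*
        from foods_episodes as fe
        inner join episodes as e on e.id = fe.episode_id
        inner join foods as f on fe.food_id = f.id
        inner join (select food_id from foods_episodes
                    group by food_id order by count(*) desc limit 10) as lol
          on lol.food_id = fe.food_id
        where e.season between 3 and 5
        order by f.name""",
    "in + distinct": """
        select distinct f.*
        from foods_episodes as fe
        inner join episodes as e on e.id = fe.episode_id
        inner join foods as f on fe.food_id = f.id
        where fe.food_id in (select food_id from foods_episodes
                             group by food_id order by count(*) desc limit 10)
          and e.season between 3 and 5
        order by f.name""",
}


def build_toy_db():
    """Create an in-memory database filled with made-up rows (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # 12 episodes, two per season, so seasons run from 1 to 6.
    conn.executemany(
        "insert into episodes values (?, ?, ?)",
        [(e, (e - 1) // 2 + 1, f"episode{e:02d}") for e in range(1, 13)],
    )
    conn.executemany(
        "insert into foods values (?, ?)",
        [(i, f"food{i:02d}") for i in range(1, 13)],
    )
    # Food i appears in i distinct episodes, starting at episode 5 (season 3)
    # and wrapping around, so every food has a different appearance count.
    links = [(i, (4 + k) % 12 + 1) for i in range(1, 13) for k in range(i)]
    conn.executemany("insert into foods_episodes values (?, ?)", links)
    return conn


def compare(conn):
    """Run each query, print its plan, and return the rows per query."""
    results = {}
    for label, sql in QUERIES.items():
        results[label] = conn.execute(sql).fetchall()
        print(f"== {label}: {len(results[label])} rows")
        # The last column of each EXPLAIN QUERY PLAN row is the plan text.
        for row in conn.execute("explain query plan " + sql):
            print("   ", row[-1])
    return results


conn = build_toy_db()
results = compare(conn)
```

The exact plan text differs between SQLite versions, and on tables this small the differences hardly matter. Still, the output shows which tables are scanned, which are searched by primary key, and where temporary b-trees are built for distinct or order by, which may be enough to judge the queries without studying the internals in depth.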