sql - Why does an unused FROM table make a statement slow

Question

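I have two tables. One is a changelog table, whose details are not important here. The other contains seller information and, most importantly, has two columns:

a primary id
and an id called ident (= the real-world id).

When a seller changes, a new entry is created: ident stays the same, but the new entry gets a new id. I have a primary index on id and another index on (ident, -id), so I can fetch the current data quickly.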
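
A minimal sketch of the setup described above may help when reading the plans. Only the columns id and ident, the index on (ident, -id), and the index name idx_sellers (which appears in the query plans) come from the question; the column types are assumptions, and the changelog table is left out because its details are not given.

```sql
-- Sketch only: column types are assumed, not taken from the question.
CREATE TABLE sellers (
    id    serial PRIMARY KEY,   -- primary id; every revision gets a new one
    ident integer NOT NULL      -- real-world id; unchanged across revisions
    -- ... further seller columns omitted
);

-- Expression index on (ident, -id): within each ident, the newest
-- revision (largest id) comes first in index order.
CREATE INDEX idx_sellers ON sellers (ident, (-id));
```

In PostgreSQL an index expression such as -id has to be wrapped in its own parentheses, hence (ident, (-id)).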

I stumbled upon the following strange behaviour:

This takes a long time to finish:

SELECT DISTINCT ON (ident) sellers.* FROM changelog, sellers ORDER BY ident,id DESC;

                                  QUERY PLAN                                    
---------------------------------------------------------------------------------
 Unique  (cost=741675.98..760122.47 rows=10 width=30)
   ->  Sort  (cost=741675.98..750899.22 rows=3689298 width=30)
         Sort Key: sellers.ident, sellers.id
         ->  Nested Loop  (cost=3.07..74457.37 rows=3689298 width=30)
               ->  Seq Scan on changelog  (cost=0.00..668.34 rows=38034 width=0)
               ->  Materialize  (cost=3.07..4.04 rows=97 width=30)
                     ->  Seq Scan on sellers  (cost=0.00..2.97 rows=97 width=30)

When I replace DESC with -id, it is fast but yields the same result.

SELECT DISTINCT ON (ident) sellers.* FROM changelog, sellers ORDER BY ident,-id;

                                         QUERY PLAN                                        
------------------------------------------------------------------------------------------
 Unique  (cost=706.37..92956.53 rows=10 width=30)
   ->  Nested Loop  (cost=706.37..83733.28 rows=3689298 width=30)
         ->  Index Scan using idx_sellers on sellers  (cost=0.00..17.70 rows=97 width=30)
         ->  Materialize  (cost=706.37..1086.71 rows=38034 width=0)
               ->  Seq Scan on changelog  (cost=0.00..668.34 rows=38034 width=0)

When I remove changelog from FROM, both ORDER BY -id and DESC are fast again and give the same query plan.

SELECT DISTINCT ON (ident) sellers.* FROM sellers ORDER BY ident,id DESC;

                             QUERY PLAN                              
---------------------------------------------------------------------
 Unique  (cost=6.17..6.66 rows=10 width=30)
   ->  Sort  (cost=6.17..6.41 rows=97 width=30)
         Sort Key: ident, id
         ->  Seq Scan on sellers  (cost=0.00..2.97 rows=97 width=30)

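My questions:

Why does including an unused table in FROM affect the query?
Why does ORDER BY ident,-id not use the same plan as ORDER BY ident,id DESC?

Edit: My real query of course has a WHERE clause that joins the two tables.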

score 9 · Accepted Answer

There is no unused table in the query.

SELECT DISTINCT ON (ident) sellers.* 
FROM changelog, sellers ORDER BY ident,id DESC

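Your original query creates a cross join (every record in changelog is joined to every record in sellers). The plan shows this: the nested loop's estimate of 3689298 rows is exactly 38034 (changelog) × 97 (sellers). This is one of the reasons you should avoid the implicit join syntax, which was superseded by the explicit syntax 20 years ago.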
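
As a sketch of the explicit syntax: the question does not show how the two tables are related, so the join column changelog.seller_ident below is hypothetical and stands in for whatever the real WHERE clause compares. The first statement spells out what the comma form actually does; the second restricts the pairing with a join condition.

```sql
-- What "FROM changelog, sellers" without a WHERE clause means:
-- every changelog row is paired with every sellers row.
SELECT DISTINCT ON (sellers.ident) sellers.*
FROM changelog
CROSS JOIN sellers
ORDER BY sellers.ident, sellers.id DESC;

-- Explicit join; seller_ident is a hypothetical column name,
-- not taken from the question.
SELECT DISTINCT ON (s.ident) s.*
FROM sellers AS s
JOIN changelog AS c ON c.seller_ident = s.ident
ORDER BY s.ident, -s.id;   -- same expression as the (ident, -id) index
```

On the second question, a likely explanation is that ORDER BY ident, -id matches the expression index on (ident, -id) term for term, so the planner can read the rows in index order and skip the sort. It does not treat id DESC as the same expression as -id, so that form falls back to an explicit sort over the whole join result.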
