我通过将 temp 重命名为 t1 并将 temp1 重命名为 t2 稍微改写了您的查询,以便可以更清楚地解释问题。
你的问题可以从两方面来回答:
1)我假设您使用的是旧版本的 hawq,其中查询在优化器打开时工作,而在优化器关闭时不起作用。在最新的 apache hawq中,它实际上不适用于优化器的开/关。原因是当优化器打开时它现在回退到规划器。
2)查询失败是由于查询执行器的错误。
具体来说,把CTE(公用表表达式)子句中的t2当做子查询扫描然后物化是正确的。
但是,在评估 t2 时,它不会生成元组。因此,t1 与 t2 的连接不会生成元组。
我们可以在使用 optimizer = off 运行查询时使用解释分析在查询执行统计中看到这一点。
-> Materialize (cost=0.00..0.01 rows=1 width=0)
Rows out: 0 rows with 0.167 ms to end of 3 scans, start offset by 0.265 ms.
-> Limit (cost=0.00..0.00 rows=1 width=0)
Rows out: 0 rows with 0.003 ms to end, start offset by 0.257 ms.
-> Subquery Scan t2 (cost=0.00..0.01 rows=1 width=0)
Rows out: 0 rows with 0.002 ms to end, start offset by 0.258 ms.
-> Result (cost=0.00..0.01 rows=1 width=0)
One-Time Filter: 3 = $0
Rows out: 0 rows with 0.001 ms to end, start offset by 0.258 ms.
我们有HAWQ-884跟踪此问题,您可以参考它以获取更新和详细信息。
PS:这里是关于查询执行统计的详细信息:
1) 关闭优化器的 hawq 2.0 (planner)
show optimizer;
optimizer
-----------
off
(1 row)
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
c1 | c2
----+----
(0 rows)
EXPLAIN ANALYZE
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.05..0.29 rows=72 width=8)
Rows out: Avg 0.0 rows x 0 workers. Max/Last(/) 0/0 rows with 0.237/0.237 ms to end.
-> Limit (cost=0.00..0.00 rows=1 width=0)
Rows out: Avg 0.0 rows x 0 workers. Max/Last(/) 0/0 rows with 0.003/0.003 ms to end.
-> Subquery Scan t2 (cost=0.00..0.01 rows=6 width=0)
Rows out: Avg 0.0 rows x 0 workers. Max/Last(/) 0/0 rows with 0.002/0.002 ms to end.
-> Result (cost=0.00..0.01 rows=1 width=0)
One-Time Filter: 3 = $0
Rows out: Avg 0.0 rows x 0 workers. Max/Last(/) 0/0 rows with 0/0 ms to end.
-> Materialize (cost=0.05..0.17 rows=12 width=8)
Rows out: Avg 1.0 rows x 1 workers. Max/Last(/) 1/1 rows with 0.129/0.129 ms to end, start offset by 0.135/0.135 ms.
-> Append (cost=0.00..0.04 rows=2 width=0)
Rows out: Avg 2.0 rows x 1 workers. Max/Last(/) 2/2 rows with 0.002/0.002 ms to first row, 0.004/0.004 ms to end, start offset by 0.255/0.255 ms.
-> Result (cost=0.00..0.01 rows=1 width=0)
Rows out: Avg 1.0 rows x 1 workers. Max/Last(/) 1/1 rows with 0.002/0.002 ms to end, start offset by 0.255/0.255 ms.
-> Result (cost=0.00..0.01 rows=1 width=0)
Rows out: Avg 1.0 rows x 1 workers. Max/Last(/) 1/1 rows with 0/0 ms to end, start offset by 0.261/0.261 ms.
Slice statistics:
(slice0) Executor memory: 61K bytes.
Statement statistics:
Memory used: 128000K bytes
Settings: default_hash_table_bucket_number=6; optimizer=off
Optimizer status: legacy query optimizer
Data locality statistics:
no data locality information in this query
Total runtime: 0.372 ms
(26 rows)
2) 带有优化器的 hawq 2.0 (orca)
show optimizer;
optimizer
-----------
on
(1 row)
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
c1 | c2
----+----
(0 rows)
EXPLAIN ANALYZE
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.05..0.29 rows=72 width=8)
Rows out: Avg 0.0 rows x 0 workers. Max/Last(/) 0/0 rows with 0.273/0.273 ms to end.
-> Limit (cost=0.00..0.00 rows=1 width=0)
Rows out: Avg 0.0 rows x 0 workers. Max/Last(/) 0/0 rows with 0.003/0.003 ms to end.
-> Subquery Scan t2 (cost=0.00..0.01 rows=6 width=0)
Rows out: Avg 0.0 rows x 0 workers. Max/Last(/) 0/0 rows with 0.002/0.002 ms to end.
-> Result (cost=0.00..0.01 rows=1 width=0)
One-Time Filter: 3 = $0
Rows out: Avg 0.0 rows x 0 workers. Max/Last(/) 0/0 rows with 0.001/0.001 ms to end.
-> Materialize (cost=0.05..0.17 rows=12 width=8)
Rows out: Avg 1.0 rows x 1 workers. Max/Last(/) 1/1 rows with 0.151/0.151 ms to end, start offset by 0.189/0.189 ms.
-> Append (cost=0.00..0.04 rows=2 width=0)
Rows out: Avg 2.0 rows x 1 workers. Max/Last(/) 2/2 rows with 0.003/0.003 ms to first row, 0.004/0.004 ms to end, start offset by 0.327/0.327 ms.
-> Result (cost=0.00..0.01 rows=1 width=0)
Rows out: Avg 1.0 rows x 1 workers. Max/Last(/) 1/1 rows with 0.002/0.002 ms to end, start offset by 0.327/0.327 ms.
-> Result (cost=0.00..0.01 rows=1 width=0)
Rows out: Avg 1.0 rows x 1 workers. Max/Last(/) 1/1 rows with 0/0 ms to end, start offset by 0.337/0.337 ms.
Slice statistics:
(slice0) Executor memory: 61K bytes.
Statement statistics:
Memory used: 128000K bytes
Settings: default_hash_table_bucket_number=6
Optimizer status: legacy query optimizer
Data locality statistics:
no data locality information in this query
Total runtime: 0.468 ms
(26 rows)
3) 关闭优化器的 hawq 1.x (planner)
show optimizer;
optimizer
-----------
off
(1 row)
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
c1 | c2
----+----
(0 rows)
EXPLAIN ANALYZE
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
QUERY PLAN
--------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..0.08 rows=4 width=8)
Rows out: 0 rows with 0.220 ms to end, start offset by 0.093 ms.
-> Append (cost=0.00..0.04 rows=2 width=0)
Rows out: 2 rows with 0.001 ms to first row, 0.003 ms to end, start offset by 0.262 ms.
-> Result (cost=0.00..0.01 rows=1 width=0)
Rows out: 1 rows with 0.001 ms to end, start offset by 0.262 ms.
-> Result (cost=0.00..0.01 rows=1 width=0)
Rows out: 1 rows with 0.001 ms to end, start offset by 0.264 ms.
-> Materialize (cost=0.00..0.01 rows=1 width=0)
Rows out: 0 rows with 0.167 ms to end of 3 scans, start offset by 0.265 ms.
-> Limit (cost=0.00..0.00 rows=1 width=0)
Rows out: 0 rows with 0.003 ms to end, start offset by 0.257 ms.
-> Subquery Scan t2 (cost=0.00..0.01 rows=1 width=0)
Rows out: 0 rows with 0.002 ms to end, start offset by 0.258 ms.
-> Result (cost=0.00..0.01 rows=1 width=0)
One-Time Filter: 3 = $0
Rows out: 0 rows with 0.001 ms to end, start offset by 0.258 ms.
Slice statistics:
(slice0) Executor memory: 61K bytes.
Statement statistics:
Memory used: 128000K bytes
Settings: optimizer=off
Optimizer status: legacy query optimizer
Total runtime: 0.315 ms
(24 rows)
4) 带有优化器的 hawq 1.x (orca)
show optimizer;
optimizer
-----------
on
(1 row)
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
c1 | c2
----+----
3 | 4
(1 row)
EXPLAIN ANALYZE
WITH t1 AS ( SELECT 1 c1, 2 c2 UNION ALL SELECT 3 c1, 4 c2 ),
t2 AS ( SELECT 3 c3 )
SELECT * FROM t1
WHERE EXISTS (SELECT * FROM t2 WHERE c3=c1);
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Hash EXISTS Join (cost=0.00..0.00 rows=2 width=8)
Hash Cond: "outer".c1 = "inner".c3
Rows out: 1 rows with 0.835 ms to first row, 2.373 ms to end, start offset by 0.194 ms.
Executor memory: 1K bytes.
Work_mem used: 1K bytes. Workfile: (0 spilling, 0 reused)
Hash chain length 1.0 avg, 1 max, using 1 of 524341 buckets.
-> Append (cost=0.00..0.00 rows=2 width=8)
Rows out: 2 rows with 0.002 ms to first row, 0.004 ms to end, start offset by 1.023 ms.
-> Result (cost=0.00..0.00 rows=1 width=8)
Rows out: 1 rows with 0.001 ms to first row, 0.002 ms to end, start offset by 1.023 ms.
-> Result (cost=0.00..0.00 rows=1 width=1)
Rows out: 1 rows with 0 ms to end, start offset by 1.024 ms.
-> Result (cost=0.00..0.00 rows=1 width=8)
Rows out: 1 rows with 0.001 ms to end, start offset by 1.026 ms.
-> Result (cost=0.00..0.00 rows=1 width=1)
Rows out: 1 rows with 0 ms to end, start offset by 1.026 ms.
-> Hash (cost=0.00..0.00 rows=1 width=4)
Rows in: 1 rows with 0.010 ms to end, start offset by 1.013 ms.
-> Result (cost=0.00..0.00 rows=1 width=4)
Rows out: 1 rows with 0.006 ms to first row, 0.007 ms to end, start offset by 1.013 ms.
-> Result (cost=0.00..0.00 rows=1 width=1)
Rows out: 1 rows with 0.001 ms to end, start offset by 1.014 ms.
Slice statistics:
(slice0) Executor memory: 8270K bytes. Work_mem: 1K bytes max.
Statement statistics:
Memory used: 128000K bytes
Optimizer status: PQO version 1.591
Total runtime: 2.572 ms
(28 rows)