Suppose I have a query like this:
SELECT *
FROM clients c
INNER JOIN clients_balances cb ON cb.id_clients = c.id
LEFT JOIN clients com ON com.id = c.id_companies
LEFT JOIN clients com_real ON com_real.id = c.id_companies_real
LEFT JOIN rate_tables rt_orig ON rt_orig.id = c.orig_rate_table
LEFT JOIN rate_tables rt_term ON rt_term.id = c.term_rate_table
LEFT JOIN payment_terms pt ON pt.id = c.id_payment_terms
LEFT JOIN paygw_clients_profiles cpgw ON (cpgw.id_clients = c.id AND cpgw.id_companies = c.id_companies_real)
WHERE
EXISTS (SELECT * FROM accounts WHERE (name LIKE 'x' OR accname LIKE 'x' OR ani LIKE 'x') AND id_clients = c.id)
AND c."type" = '0'
AND c."id" > 0
ORDER BY c."name";
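In production this query takes about 35 seconds (the clients table has about 1 million records). However, if I take out ANY one of the joins, the query executes in only about 300 ms.
I've played with the query planner settings, but to no avail.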
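The settings that were tried are not listed in this post. As a rough sketch of what one such experiment might look like, the snippet below assumes the join-collapse limits are among the relevant knobs. That is an assumption, not something the post confirms. Both limits default to 8 in PostgreSQL, and the query references nine tables once the EXISTS subquery is counted, so it may be worth testing whether raising them changes the join order. The value 10 is purely illustrative.

```sql
-- Session-level experiment only; nothing here changes the server configuration.
SHOW join_collapse_limit;   -- default is 8
SHOW from_collapse_limit;   -- default is 8

-- Illustrative values (assumption): large enough to cover all nine tables.
SET join_collapse_limit = 10;
SET from_collapse_limit = 10;

-- Re-run the slow query and compare the resulting join order with the earlier plan.
EXPLAIN ANALYZE
SELECT *
FROM clients c
INNER JOIN clients_balances cb ON cb.id_clients = c.id
LEFT JOIN clients com ON com.id = c.id_companies
LEFT JOIN clients com_real ON com_real.id = c.id_companies_real
LEFT JOIN rate_tables rt_orig ON rt_orig.id = c.orig_rate_table
LEFT JOIN rate_tables rt_term ON rt_term.id = c.term_rate_table
LEFT JOIN payment_terms pt ON pt.id = c.id_payment_terms
LEFT JOIN paygw_clients_profiles cpgw ON (cpgw.id_clients = c.id AND cpgw.id_companies = c.id_companies_real)
WHERE
EXISTS (SELECT * FROM accounts WHERE (name LIKE 'x' OR accname LIKE 'x' OR ani LIKE 'x') AND id_clients = c.id)
AND c."type" = '0'
AND c."id" > 0
ORDER BY c."name";

-- Restore the defaults for the rest of the session.
RESET join_collapse_limit;
RESET from_collapse_limit;
```

Because SET only affects the current session, this can be tried without touching postgresql.conf.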
Here are some EXPLAIN ANALYZE outputs:
http://explain.depesz.com/s/hzy (slow - 48049.574 ms)
http://explain.depesz.com/s/FWCd (fast - 286.234 ms, rate_tables JOIN removed)
http://explain.depesz.com/s/MyRf (fast - 539.733 ms, paygw_clients_profiles JOIN removed)
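It looks like in the fast cases the planner starts with the EXISTS subquery and only has to perform the joins for two rows in total. In the slow case, however, it joins all the tables first and only then filters by EXISTS.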
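One way to check this reading of the plans is to look at the EXISTS filter on its own. The sketch below uses only the tables and columns already shown in the query and assumes the same 'x' placeholder for the search value.

```sql
-- How many distinct clients does the accounts filter select by itself?
SELECT count(DISTINCT id_clients)
FROM accounts
WHERE name LIKE 'x' OR accname LIKE 'x' OR ani LIKE 'x';

-- How does the planner execute that filter in isolation?
EXPLAIN (ANALYZE, BUFFERS)
SELECT id_clients
FROM accounts
WHERE name LIKE 'x' OR accname LIKE 'x' OR ani LIKE 'x';
```

If the count is small, the filter itself is selective, and the slowdown would then come from where the planner places it in the join order rather than from the accounts lookup.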
What I need is for this query to run in a reasonable time with all seven joins in place.
The Postgres version is 9.3.10, on CentOS 6.3.
Thanks.
UPDATE
Rewriting the query like this:
SELECT *
FROM clients c
INNER JOIN clients_balances cb ON cb.id_clients = c.id
INNER JOIN accounts a ON a.id_clients = c.id AND (a.name = 'x' OR a.accname = 'x' OR a.ani = 'x')
LEFT JOIN clients com ON com.id = c.id_companies
LEFT JOIN clients com_real ON com_real.id = c.id_companies_real
LEFT JOIN rate_tables rt_orig ON rt_orig.id = c.orig_rate_table
LEFT JOIN rate_tables rt_term ON rt_term.id = c.term_rate_table
LEFT JOIN payment_terms pt ON pt.id = c.id_payment_terms
LEFT JOIN paygw_clients_profiles cpgw ON (cpgw.id_clients = c.id AND cpgw.id_companies = c.id_companies_real)
WHERE
c."type" = '0' AND c.id > 0
ORDER BY c."name";
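makes it run fast, but this is not acceptable: the account filter parameters are optional, and I still need results when there are no matches in that table. Using "LEFT JOIN accounts" instead of "INNER JOIN accounts" kills the performance again.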