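MySQL seems unable to optimize a SELECT that joins against a GROUP BY subquery, and the query ends up taking a very long time to execute. There must be a known optimization for such a common scenario.
Suppose we want to return every order in the database, together with a flag indicating whether it is the customer's first order.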
CREATE TABLE orders (`order` int, customer int, date date);
Retrieving each customer's first order is very fast.
SELECT customer, MIN(`order`) AS first_order FROM orders GROUP BY customer;
However, as soon as we join that result against the full set of orders using a subquery, it becomes very slow:
SELECT `order`, first_order FROM orders LEFT JOIN (
SELECT customer, MIN(`order`) AS first_order FROM orders GROUP BY customer
) AS first_orders ON orders.`order` = first_orders.first_order;
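One way to see where the time goes is to ask MySQL for the execution plan of the slow query. The following is a minimal sketch, assuming the schema from the question; the column is written as `order` in backticks because ORDER is a reserved word in MySQL.

```sql
-- Show the plan for the slow LEFT JOIN against the GROUP BY subquery.
-- A row with select_type = DERIVED means the subquery is materialized
-- into an internal temporary table before the join runs.
EXPLAIN
SELECT orders.`order`, first_orders.first_order
FROM orders
LEFT JOIN (
    SELECT customer, MIN(`order`) AS first_order
    FROM orders
    GROUP BY customer
) AS first_orders ON orders.`order` = first_orders.first_order;
```

In the output, the `key` column on the row that reads from the derived table shows whether the join can use an index on it. If that column is NULL, the derived table is probably being scanned for every row of `orders`, which would be consistent with the slowdown described here. The exact behavior depends on the MySQL version.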
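I am hoping there is a simple trick we are missing, because the following version with a temporary table is about 1000 times faster: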
CREATE TEMPORARY TABLE tmp_first_order AS
SELECT customer, MIN(`order`) AS first_order FROM orders GROUP BY customer;
CREATE INDEX tmp_boost ON tmp_first_order (first_order);
SELECT `order`, first_order FROM orders LEFT JOIN tmp_first_order
ON orders.`order` = tmp_first_order.first_order;
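Edit:
Inspired by option 3 proposed by @ruakh, there is indeed a less ugly workaround using INNER JOIN and UNION. It has acceptable performance and does not require a temporary table. However, it is somewhat specific to our case, and I wonder whether a more general optimization exists.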
SELECT `order`, 'YES' AS first FROM orders INNER JOIN (
SELECT MIN(`order`) AS first_order FROM orders GROUP BY customer
) AS first_orders_1 ON orders.`order` = first_orders_1.first_order
UNION
SELECT `order`, 'NO' AS first FROM orders INNER JOIN (
SELECT customer, MIN(`order`) AS first_order FROM orders GROUP BY customer
) AS first_orders_2 ON first_orders_2.customer = orders.customer
AND orders.`order` > first_orders_2.first_order;
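As a possible starting point for a more general form, the two halves of the UNION can be folded into a single join on customer, with the flag computed per row. This is only a sketch and has not been measured against the queries above. The index name `idx_customer_order` and the aliases `o` and `f` are hypothetical, and it assumes `order` values are unique, as the queries above already do.

```sql
-- Hypothetical composite index: lets the GROUP BY ... MIN() subquery read
-- from the index alone and gives the join on customer something to use.
CREATE INDEX idx_customer_order ON orders (customer, `order`);

-- One pass over orders, flagging each row instead of using UNION.
-- LEFT JOIN keeps orders whose customer is NULL; they are flagged 'NO'.
SELECT o.`order`,
       IF(o.`order` = f.first_order, 'YES', 'NO') AS first
FROM orders AS o
LEFT JOIN (
    SELECT customer, MIN(`order`) AS first_order
    FROM orders
    GROUP BY customer
) AS f ON f.customer = o.customer;
```

Joining on customer rather than on the order number means every order finds its customer's minimum, so no second branch is needed for the 'NO' rows. Whether this beats the temporary table still depends on how the server handles the derived table, so it is worth checking with EXPLAIN.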