mysql - Need advice on optimizing a SQL query (UPDATE on MySQL)

Question

I profiled my database using the slow query log. It turns out this query is the number one offender:

UPDATE
    t1
SET
  v1t1 =
  (
    SELECT
        t2.v3t2
    FROM
        t2
    WHERE
        t2.v2t2 = t1.v2t1
    AND t2.v1t2 <= '2012-04-24'
    ORDER BY
        t2.v1t2 DESC,
        t2.v3t2 DESC
    LIMIT 1
);

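The subquery by itself is already slow. I tried variants with DISTINCT, GROUP BY and more subqueries, but none of them ran in under 4 seconds. For example, the following query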

SELECT v2t2, v3t2
FROM t2
WHERE t2.v1t2 <= '2012-04-24'
GROUP BY v2t2
ORDER BY v1t2 DESC

takes:

mysql> SELECT ...
...    
69054 rows in set (5.61 sec)    

mysql> EXPLAIN SELECT ...
+----+-------------+-------------+------+---------------+------+---------+------+---------+----------------------------------------------+
| id | select_type | table       | type | possible_keys | key  | key_len | ref  | rows    | Extra                                        |
+----+-------------+-------------+------+---------------+------+---------+------+---------+----------------------------------------------+
|  1 | SIMPLE      | t2          | ALL  | v1t2          | NULL | NULL    | NULL | 5203965 | Using where; Using temporary; Using filesort |
+----+-------------+-------------+------+---------------+------+---------+------+---------+----------------------------------------------+

mysql> SHOW CREATE TABLE t2;
...
  PRIMARY KEY (`v3t2`),
  KEY `v1t2_v3t2` (`v1t2`,`v3t2`),
  KEY `v1t2` (`v1t2`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8  

SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
|    77070 |
+----------+

SELECT COUNT(*) FROM t2;
+----------+
| COUNT(*) |
+----------+
|  5203965 |
+----------+

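I am trying to get the latest entry (v3t2) together with its parent (v2t2). That shouldn't be a big deal, should it? Does anyone have a suggestion as to which knobs I should turn? Any help or hint is greatly appreciated!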

This should be a more appropriate SELECT statement:

SELECT
    t1.v2t1,
  (
    SELECT
        t2.v3t2
    FROM
        t2
    WHERE
        t2.v2t2 = t1.v2t1
    AND t2.v1t2 <= '2012-04-24'
    ORDER BY
        t2.v1t2 DESC,
        t2.v3t2 DESC
    LIMIT 1
) AS latest   
FROM
    t1

score 1 · Accepted Answer

Your ORDER BY ... LIMIT 1 forces the database to perform a full scan of the table just to return one row. It looks very much like a candidate for indexing.

Before you build the index, check the field's selectivity by running:

SELECT count(*), count(v1t2), count(DISTINCT v1t2) FROM t2;

If the column holds a high number of non-NULL values and the number of distinct values is more than 40% of the non-NULLs, then building an index is a good way to go.
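The ratio described above can be computed in the same statement. This is only a sketch: the column aliases are made up for illustration, and the 40% threshold is this answer's rule of thumb rather than a hard limit.

```sql
-- Selectivity of v1t2: share of distinct values among the non-NULL values.
-- NULLIF guards against division by zero if the column holds only NULLs.
SELECT
    COUNT(*)                                      AS total_rows,
    COUNT(v1t2)                                   AS non_null_rows,
    COUNT(DISTINCT v1t2)                          AS distinct_values,
    COUNT(DISTINCT v1t2) / NULLIF(COUNT(v1t2), 0) AS selectivity
FROM t2;
```

A selectivity above roughly 0.4 would meet the rule of thumb given here.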

If the index does not help, you should analyze the data in your columns. You are using the condition t2.v1t2 <= '2012-04-24', which, if the table holds a historical set of records, gives the planner nothing to work with: all rows are expected to be in the past, so a full scan is the best choice anyway. In that case an index on this column alone is useless.
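One way to check whether the date filter narrows anything down is to count how many rows it lets through. This is a minimal sketch that relies on MySQL evaluating a comparison to 1 or 0; the aliases are illustrative.

```sql
-- Compare the number of rows passing the date filter with the table size.
-- In MySQL a comparison yields 1 (true) or 0 (false), so SUM counts the matches.
SELECT
    COUNT(*)                  AS total_rows,
    SUM(v1t2 <= '2012-04-24') AS rows_matching_filter
FROM t2;
```

If rows_matching_filter is close to total_rows, the condition excludes almost nothing and a range scan on v1t2 is unlikely to beat a full scan.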

What you should do instead is think about how to rewrite the query so that only a limited subset of records is checked. Your ORDER BY ... DESC LIMIT 1 construct suggests that you want the most recent entry up to and including '2012-04-24'. Why not try rewriting the query as something like:

SELECT v2t2, v3t2
FROM t2
WHERE t2.v1t2 >= DATE_ADD('2012-04-24', INTERVAL -10 DAY)
GROUP BY v2t2
ORDER BY v1t2 DESC;

This is just an example; with knowledge of your database design and the nature of your data, a more precise query can be built.

score 0 · Accepted Answer

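I would look at the indexes built on t2 for the subselect. You should have an index on v2t2 and, because of the sort, possibly one on v1t2 and v3t2 as well. The indexes should reduce the time the subselect needs to find its results before the UPDATE query uses them.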
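As a rough sketch of that idea, the three columns could also go into a single composite index instead of separate ones. The index name is hypothetical, and whether the optimizer actually uses it for the sort depends on the data and the MySQL version.

```sql
-- Hypothetical composite index: the equality column (v2t2) comes first,
-- followed by the two ORDER BY columns in the order they are sorted.
ALTER TABLE t2
    ADD INDEX v2t2_v1t2_v3t2 (v2t2, v1t2, v3t2);
```

With v2t2 fixed by the correlated condition, the rows for one parent are stored in (v1t2, v3t2) order inside this index, which may allow the ORDER BY ... DESC LIMIT 1 to be answered without a filesort.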

score 0 · Accepted Answer

Would this be any better? It gets rid of one of the sort columns and groups by the key that is being used.

UPDATE
    t1
SET
  v1t1 =
  (
    SELECT
        MAX(t2.v3t2)
    FROM
        t2
    WHERE
        t2.v2t2 = t1.v2t1
    AND t2.v1t2 <= '2012-04-24'
    GROUP BY t2.v1t2
    ORDER BY t2.v1t2 DESC
    LIMIT 1
);

Alternative version

UPDATE `t1`
SET `v1t1` = (
  SELECT MAX(`t2`.`v3t2`)
  FROM `t2`
  WHERE `t2`.`v2t2` = `t1`.`v2t1`
  AND `t2`.`v1t2` = (
    SELECT MAX(`t2`.`v1t2`)
    FROM `t2`
    WHERE `t2`.`v2t2` = `t1`.`v2t1`
    AND `t2`.`v1t2` <= '2012-04-24'
    LIMIT 1
  )
  LIMIT 1
);

And add this index to t2:

KEY `v2t2_v1t2` (`v2t2`, `v1t2`)
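Written as a standalone statement, the key definition above could be applied and then checked as follows. The literal 1 is only a placeholder for one parent value, since the column type of v2t2 is not shown in the question.

```sql
-- Add the suggested composite key to the existing table.
ALTER TABLE `t2`
    ADD KEY `v2t2_v1t2` (`v2t2`, `v1t2`);

-- Inspect the plan of the inner lookup for a single parent value.
-- The value 1 is a placeholder; substitute a real v2t2 value.
EXPLAIN
SELECT MAX(`v1t2`)
FROM `t2`
WHERE `v2t2` = 1
  AND `v1t2` <= '2012-04-24';
```

If the key column of the EXPLAIN output no longer shows NULL, the lookup is being served from an index rather than from a full table scan.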
