mysql - MySQL query with a large number of records gets killed

Question

I run the following query from my shell:

    mysql -h my-host.net -u myuser -p -e "SELECT component_id, parent_component_id FROM myschema.components comp INNER JOIN my_second_schema.component_parents related_comp ON comp.id = related_comp.component_id ORDER BY component_id;" > /tmp/IT_component_parents.txt

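The query runs for a long time and then gets killed.

However, if I add LIMIT 1000, the query runs to completion and the output is written to the file.

I investigated further and found (using COUNT(*)) that the total number of records that would be returned is 239553163.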

Some information about my server is here:

MySQL 5.5.27

    +----------------------------+----------+
    | Variable_name              | Value    |
    +----------------------------+----------+
    | connect_timeout            | 10       |
    | delayed_insert_timeout     | 300      |
    | innodb_lock_wait_timeout   | 50       |
    | innodb_rollback_on_timeout | OFF      |
    | interactive_timeout        | 28800    |
    | lock_wait_timeout          | 31536000 |
    | net_read_timeout           | 30       |
    | net_write_timeout          | 60       |
    | slave_net_timeout          | 3600     |
    | wait_timeout               | 28800    |
    +----------------------------+----------+

This is the query state as I monitored it:

    copying to tmp table on disk
    sorting results
    sending data
    writing to net
    sending data
    writing to net
    sending data
    writing to net
    sending data ...
    KILLED

Any guesses as to what is wrong here?

score 21 · Accepted Answer

The mysql client is probably running out of memory.

Use the --quick option so that it does not buffer the results in memory.
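
As a rough sketch of what that could look like with the command from the question: `--quick` (short form `-q`) makes the client print each row as it is received instead of caching the whole result set first. The function name `dump_component_parents` and the `MYSQL_BIN` override are hypothetical names introduced here for illustration (the override only makes the function easy to dry-run); the host, user, and query are copied from the question.

```shell
# Hypothetical wrapper: run the question's query with --quick so the client
# streams rows to stdout instead of buffering all of them in memory.
# MYSQL_BIN is an assumption of this sketch; it defaults to the real client.
dump_component_parents() {
    "${MYSQL_BIN:-mysql}" --quick -h my-host.net -u myuser -p \
        -e "SELECT component_id, parent_component_id FROM myschema.components comp INNER JOIN my_second_schema.component_parents related_comp ON comp.id = related_comp.component_id ORDER BY component_id;"
}
```

It would be called the same way as the original command, for example `dump_component_parents > /tmp/IT_component_parents.txt`.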

Answered 2015-10-14T15:07:28.687

score 1 · Accepted Answer

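The problem is that you are returning 239 553 163 rows of data! Don't be surprised that it takes a lot of time to process. In fact, the longest part is most likely sending the result set back to your client.

Reduce the result set (do you really need all those rows?). Or try to output the data in smaller batches: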

    mysql -h my-host.net -u myuser -p -e "SELECT ... LIMIT 0, 10000" >> dump.txt
    mysql -h my-host.net -u myuser -p -e "SELECT ... LIMIT 10000, 10000" >> dump.txt
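
To avoid typing each batch by hand, the clauses could be generated in a loop. This is only a sketch: `batch_limits` is a hypothetical helper name, and it relies on MySQL's two-argument form being `LIMIT offset, row_count`, so the offset advances by the batch size while the row count stays fixed.

```shell
# Print one "LIMIT offset, row_count" clause per batch.
# Usage: batch_limits TOTAL_ROWS BATCH_SIZE
batch_limits() {
    total=$1
    batch=$2
    offset=0
    # Keep emitting clauses until the offset has passed the last row.
    while [ "$offset" -lt "$total" ]; do
        echo "LIMIT $offset, $batch"
        offset=$((offset + batch))
    done
}
```

Each printed clause can then be appended to the `SELECT ...` statement and the output appended to `dump.txt` with `>>`. The `ORDER BY` has to stay in the statement for the batches to line up, and the column header is repeated in every batch unless the client is run with `--skip-column-names`.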

score 1 · Accepted Answer

Assuming you mean 8 hours when you say a long time, the value 28800 for your wait_timeout causes the connection to drop with no further activity in 28,800 seconds, i.e. 8 hours. If you can't optimize the statement to run in less than 8 hours, you should increase this value.

See this page for further information on the wait_timeout variable.

The interactive_timeout variable is used for interactive client connections, so if you run long queries from an interactive session, that's the one you need to look at.

score 0 · Accepted Answer

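If you are dumping a large amount of data, you may want to use the OUTFILE mechanism (SELECT ... INTO OUTFILE). That or mysqldump will be more efficient (and OUTFILE has the benefit of not locking the table).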

score 0 · Accepted Answer

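You said in a comment that your MySQL instance is on RDS. That means you cannot run the query from the same host, because you cannot log in to the RDS host. I guess you are probably running this query over a WAN from your local network.

Because of a slow network, you are very likely to run into trouble. Your process state frequently showing "writing to net" makes me think this is your bottleneck.

Your bottleneck could also be the sort. Your sort is writing to a temporary table, which can take a long time for such a large result set. Can you skip the ORDER BY?

Even so, I would not expect the query to be killed, even if it runs for 3100 seconds or more. I wonder whether your DBA has some periodic job that kills long-running queries, such as pt-kill. Ask your DBA.

To reduce network transfer time, you could try using the compressed protocol. You can use the --compress or -C flag of the mysql client for this (see https://dev.mysql.com/doc/refman/5.7/en/mysql-command-options.html#option_mysql_compress).

On a slow network, compression will help. For example, read some comparisons here: https://www.percona.com/blog/2007/12/20/large-result-sets-vs-compression-protocol/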
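
A minimal sketch of a compressed run, assuming the same host and user as in the question. The function name `dump_compressed` and the `MYSQL_BIN` override are hypothetical names used only for illustration; `--compress` is the long form of `-C`.

```shell
# Hypothetical wrapper: run a query over the compressed client/server protocol.
# Usage: dump_compressed "SQL STATEMENT" > output-file
# MYSQL_BIN is an assumption of this sketch; it defaults to the real client.
dump_compressed() {
    "${MYSQL_BIN:-mysql}" --compress -h my-host.net -u myuser -p -e "$1"
}
```

Whether this helps depends on how compressible the rows are and on how much CPU the client and server have to spare, so it is worth timing against an uncompressed run using a modest `LIMIT`.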

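Another option is to run the query from an EC2 spot instance running in the same AZ as your RDS instance. The network between those two instances will be much faster, so it will not delay your data transfer. Save the query output to a file on the EC2 spot instance.

Once the query results are saved on your EC2 instance, you can download them to your local machine using scp or something similar, which should be more tolerant of a slow network.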
