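We have upgraded our database server from MySQL 5.1 to MariaDB 5.5 (5.5.40-MariaDB-1~wheezy-log).
Since the upgrade, the MySQL connections of some long-running processes are being dropped. The typical scenario for these processes is:
- Connect to MySQL
- Run some queries
- Do some heavy work for at least a minute without touching the MySQL connection
- Try to run a query on the original connection
- Get an exception with error 2006 - MySQL server has gone away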
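To make the failing pattern concrete, here is a minimal Python sketch of the steps above. It does not talk to a real server: `FakeConnection`, `ServerGoneError`, `run_scenario` and the `idle_limit` parameter are hypothetical names used only for illustration, and the idea that the connection is dropped after a fixed idle period is an assumption, not something we have confirmed.

```python
import time

CR_SERVER_GONE_ERROR = 2006  # client error code for "MySQL server has gone away"


class ServerGoneError(Exception):
    """Hypothetical stand-in for the driver's 'server has gone away' exception."""

    def __init__(self):
        super().__init__(CR_SERVER_GONE_ERROR, "MySQL server has gone away")
        self.code = CR_SERVER_GONE_ERROR


class FakeConnection:
    """Hypothetical connection that is dropped after `idle_limit` idle seconds."""

    def __init__(self, idle_limit, clock=time.monotonic):
        self.idle_limit = idle_limit  # assumption: some idle limit exists
        self.clock = clock
        self.last_used = clock()

    def query(self, sql):
        now = self.clock()
        if now - self.last_used > self.idle_limit:
            raise ServerGoneError()
        self.last_used = now
        return "ok"


def run_scenario(idle_limit, work_seconds):
    """Connect, query, work without touching the connection, then query again."""
    fake_now = [0.0]  # simulated clock, so the sketch never really sleeps
    conn = FakeConnection(idle_limit, clock=lambda: fake_now[0])
    conn.query("SELECT 1")       # run some queries
    fake_now[0] += work_seconds  # heavy work, connection left untouched
    try:
        return conn.query("SELECT 1")  # query on the original connection
    except ServerGoneError as exc:
        return exc.code
```

With an assumed idle limit of 60 seconds, `run_scenario(60, 90)` returns the error code, while `run_scenario(60, 10)` completes normally. This mirrors what our scripts see after about a minute of work.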
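This happens in PHP CLI scripts (PHP 5.3), but also in a Ruby application (Redmine 2.5.1). It did not happen with MySQL 5.1, and nothing has changed on the application side, so it should not be application related.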
The %timeout% variables in MariaDB are:
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| connect_timeout            | 5        |
| deadlock_timeout_long      | 50000000 |
| deadlock_timeout_short     | 10000    |
| delayed_insert_timeout     | 300      |
| innodb_lock_wait_timeout   | 50       |
| innodb_rollback_on_timeout | OFF      |
| interactive_timeout        | 28800    |
| lock_wait_timeout          | 31536000 |
| net_read_timeout           | 30       |
| net_write_timeout          | 60       |
| slave_net_timeout          | 3600     |
| thread_pool_idle_timeout   | 60       |
| wait_timeout               | 28800    |
+----------------------------+----------+
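As a quick sanity check, the Python sketch below compares the two idle-session timeouts from this table with the pause our scripts take between queries. The 60-second `idle_seconds` value is an assumption based on the "at least a minute" in the scenario, and `timeouts_shorter_than_idle` is a hypothetical helper written just for this comparison.

```python
# Values copied from the table (both are in seconds).
idle_timeouts = {"wait_timeout": 28800, "interactive_timeout": 28800}

# Assumption: the scripts leave the connection idle for roughly one minute.
idle_seconds = 60


def timeouts_shorter_than_idle(timeouts, idle):
    """Return the names of timeouts that would expire within the idle period."""
    return sorted(name for name, limit in timeouts.items() if limit <= idle)


expired = timeouts_shorter_than_idle(idle_timeouts, idle_seconds)

# How many one-minute pauses fit into wait_timeout.
headroom = idle_timeouts["wait_timeout"] // idle_seconds
```

Neither timeout comes anywhere near a one-minute pause, so on paper these two settings alone should not be closing the connections.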
We are not using the thread pool:
+---------------------------+---------------------------+
| Variable_name             | Value                     |
+---------------------------+---------------------------+
| thread_cache_size         | 128                       |
| thread_concurrency        | 10                        |
| thread_handling           | one-thread-per-connection |
| thread_pool_idle_timeout  | 60                        |
| thread_pool_max_threads   | 500                       |
| thread_pool_oversubscribe | 3                         |
| thread_pool_size          | 12                        |
| thread_pool_stall_limit   | 500                       |
| thread_stack              | 294912                    |
+---------------------------+---------------------------+
When this happens, an event is also written to the system log, and it looks the same every time:
Dec 16 13:00:14 DB01 mysqld: 141216 13:00:14 [Warning] Aborted connection 9202885 to db: 'some_db_name' user: 'user' host: 'app' (Unknown error)
Besides that, there are also odd disconnect messages for the root account:
Dec 16 13:05:02 DB01 mysqld: 141216 13:05:02 [Warning] Aborted connection 9225621 to db: 'unconnected' user: 'root' host: 'localhost' (Unknown error)
Dec 16 13:10:00 DB01 mysqld: 141216 13:10:00 [Warning] Aborted connection 9218291 to db: 'unconnected' user: 'root' host: 'localhost' (Unknown error)
Dec 16 13:10:12 DB01 mysqld: 141216 13:10:12 [Warning] Aborted connection 9232561 to db: 'unconnected' user: 'root' host: 'localhost' (Unknown error)
Dec 16 13:17:01 DB01 /USR/SBIN/CRON[41343]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Dec 16 13:20:02 DB01 mysqld: 141216 13:20:02 [Warning] Aborted connection 9248777 to db: 'unconnected' user: 'root' host: 'localhost' (Unknown error)
Dec 16 13:20:02 DB01 mysqld: 141216 13:20:02 [Warning] Aborted connection 9248788 to db: 'unconnected' user: 'root' host: 'localhost' (Unknown error)
Dec 16 13:20:12 DB01 mysqld: 141216 13:20:12 [Warning] Aborted connection 9248798 to db: 'unconnected' user: 'root' host: 'localhost' (Unknown error)
Is there anything in these settings that should be changed to fix the odd "server has gone away" errors?