3

我在 Debian 6.0 上的 MySQL(0.5GB 数据库)中使用 InnoDB。

今天早上(经过 120 天的完美正常运行时间),服务器突然挂起(实际上服务器在过去两天第二次挂起),我不得不通过主机 CP 重新启动它。

在 kern.log 中,我有这个:

Apr 16 09:43:35 lhost kernel: [76680.208104] INFO: task mysqld:9755 blocked for more than 120 seconds.
Apr 16 09:43:35 lhost kernel: [76680.233913] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 16 09:43:35 lhost kernel: [76680.234249] mysqld        D ffff88007dc67170     0  9755   1241 0x00000000
Apr 16 09:43:35 lhost kernel: [76680.234472]  ffff88007dc67170 0000000000000086 0000000000000000 ffffffffa021c4c2
Apr 16 09:43:35 lhost kernel: [76680.234815]  ffff880009ff1ad8 ffff880009ff1ad8 000000000000f9e0 ffff880009ff1fd8
Apr 16 09:43:35 lhost kernel: [76680.235181]  0000000000015780 0000000000015780 ffff88007d177170 ffff88007d177468
Apr 16 09:43:35 lhost kernel: [76680.235549] Call Trace:
Apr 16 09:43:35 lhost kernel: [76680.235732]  [<ffffffffa021c4c2>] ? ext4_mark_iloc_dirty+0x46e/0x4f7 [ext4]
Apr 16 09:43:35 lhost kernel: [76680.235963]  [<ffffffffa021cd61>] ? ext4_mark_inode_dirty+0x18f/0x1c4 [ext4]
Apr 16 09:43:35 lhost kernel: [76680.236212]  [<ffffffff812fc2ab>] ? __mutex_lock_common+0x122/0x192
Apr 16 09:43:35 lhost kernel: [76680.236422]  [<ffffffff812fc3d3>] ? mutex_lock+0x1a/0x31
Apr 16 09:43:35 lhost kernel: [76680.236626]  [<ffffffff81113c8c>] ? __blockdev_direct_IO+0x1d3/0xa60
Apr 16 09:43:35 lhost kernel: [76680.236852]  [<ffffffff810b51c2>] ? generic_file_buffered_write+0x1f5/0x278
Apr 16 09:43:35 lhost kernel: [76680.237097]  [<ffffffffa021da26>] ? ext4_ind_direct_IO+0x10e/0x1be [ext4]
Apr 16 09:43:35 lhost kernel: [76680.237320]  [<ffffffffa02205c9>] ? ext4_get_block+0x0/0xe2 [ext4]
Apr 16 09:43:35 lhost kernel: [76680.237528]  [<ffffffff810b57bf>] ? generic_file_aio_read+0xf6/0x536
Apr 16 09:43:35 lhost kernel: [76680.237794]  [<ffffffff81071dbf>] ? wake_futex+0x31/0x4e
Apr 16 09:43:35 lhost kernel: [76680.237984]  [<ffffffff810ef061>] ? do_sync_read+0xce/0x113
Apr 16 09:43:35 lhost kernel: [76680.238278]  [<ffffffff8100f79c>] ? __switch_to+0x285/0x297
Apr 16 09:43:35 lhost kernel: [76680.238471]  [<ffffffff81065046>] ? autoremove_wake_function+0x0/0x2e
Apr 16 09:43:35 lhost kernel: [76680.238687]  [<ffffffff81048282>] ? finish_task_switch+0x3a/0xaf
Apr 16 09:43:35 lhost kernel: [76680.238897]  [<ffffffff810efa84>] ? vfs_read+0xa6/0xff
Apr 16 09:43:35 lhost kernel: [76680.239090]  [<ffffffff810efb34>] ? sys_pread64+0x57/0x77
Apr 16 09:43:35 lhost kernel: [76680.239290]  [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b

就在完全停电之前:

Apr 16 09:45:32 lhost kernel: [76800.253284]  [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
Apr 16 09:56:29 lhost kernel: [77453.643191] Clocksource tsc unstable (delta = -788398275 ns)
Apr 16 10:01:01 lhost kernel: imklog 4.6.4, log source = /proc/kmsg started.

所以我决定打开慢查询日志,这就是记录的内容:

# Time: 130416 13:38:41
# User@Host: lhost[lhost] @ localhost []
# Query_time: 2.059521  Lock_time: 0.000026 Rows_sent: 0  Rows_examined: 1
use lhost;
SET timestamp=1366105121;
UPDATE t_fuser SET DATE_UPDATE=now()  WHERE ID = 992277;
# Time: 130416 13:56:42
# User@Host: root[root] @ localhost []
# Query_time: 3.663760  Lock_time: 0.000037 Rows_sent: 230  Rows_examined: 230
SET timestamp=1366106202;
SHOW TABLE STATUS FROM `lhost`;
# User@Host: root[root] @ localhost []
# Query_time: 3.693291  Lock_time: 0.000031 Rows_sent: 230  Rows_examined: 230
SET timestamp=1366106202;
SHOW TABLE STATUS FROM `lhost`;
# Time: 130416 14:00:19
# User@Host: lhost[lhost] @ localhost []
# Query_time: 7.139369  Lock_time: 0.000210 Rows_sent: 122  Rows_examined: 3027170

这对我来说真的很奇怪。我尝试直接通过控制台执行类似的查询(此日志中有更复杂的查询,但这不是重点)并且更快地获得结果,因此它不是索引。

我能做些什么来解决这个问题?是硬件吗?

# free -m
             total       used       free     shared    buffers     cached
Mem:          2012       1355        657          0         40        627
-/+ buffers/cache:        687       1325
Swap:         2047        129       1918

提前感谢您的建议。

附加数据:

CPU 使用率 - munin 内存使用 - munin

慢查询简介:

mysql> show profile;
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000020 |
| checking query cache for query | 0.000111 |
| checking permissions           | 0.000003 |
| checking permissions           | 0.000001 |
| checking permissions           | 0.000002 |
| checking permissions           | 0.000001 |
| checking permissions           | 0.000003 |
| Opening tables                 | 0.000032 |
| System lock                    | 0.000004 |
| Table lock                     | 0.000006 |
| init                           | 0.000074 |
| optimizing                     | 0.000020 |
| statistics                     | 0.000171 |
| preparing                      | 0.000027 |
| Creating tmp table             | 0.000160 |
| executing                      | 0.000003 |
| Copying to tmp table           | 4.572503 |
| Creating sort index            | 0.009622 |
| Copying to group table         | 0.029744 |
| Removing duplicates            | 0.010745 |
| Sorting result                 | 0.000086 |
| Sending data                   | 0.004350 |
| end                            | 0.000008 |
| removing tmp table             | 0.001207 |
| end                            | 0.000005 |
| removing tmp table             | 0.000040 |
| end                            | 0.000004 |
| query end                      | 0.000003 |
| freeing items                  | 0.000018 |
| removing tmp table             | 0.000005 |
| freeing items                  | 0.002768 |
| logging slow query             | 0.000004 |
| logging slow query             | 0.000029 |
| cleaning up                    | 0.000006 |
+--------------------------------+----------+
34 rows in set (0.00 sec)
4

1 回答 1

0

先尝试执行fsck再检查IO性能:dd if=/dev/zero of=/var/lib/mysql/file1 bs=512 count=10000

于 2013-04-16T10:33:17.853 回答