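Two days ago I set out to use PRM to set up failover for a MySQL master, and that is where every problem began.
After I started corosync, MySQL hung and the socket file went missing. My first mistake was running kill -9.
I could not recover the old master with innodb_force_recovery = 4, so I promoted a slave to master, but it ran into the same problem.
Here is my configuration.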
[mysqld]
# Settings user and group are ignored when systemd is used (fedora >= 15).
# If you need to run mysqld under different user or group, 
# customize your systemd unit file for mysqld according to the
# instructions in http://fedoraproject.org/wiki/Systemd
user=mysql
datadir=/data2/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
## Files
back_log            = 300
open-files-limit    = 8192
open-files          = 1024  
skip-external-locking
skip-name-resolve
## Logging
relay-log           = mysqld-relay-bin
relay-log-index     = mysqld-relay-bin.index
#log                = mysql-gen.log 
log_warnings
log_bin             = mysql-bin
#log_queries_not_using_indexes
max_binlog_size         = 256M  #max size for binlog before rolling
expire_logs_days        = 4 #binlog files older than this will be purged
## Per-Thread Buffers * (max_connections) = total per-thread mem usage
thread_stack            = 256K    #default: 32bit: 192K, 64bit: 256K
sort_buffer_size        = 1M      #default: 2M, larger may cause perf issues
read_buffer_size        = 1M      #default: 128K, change in increments of 4K
read_rnd_buffer_size    = 1M      #default: 256K                
join_buffer_size        = 1M      #default: 128K
binlog_cache_size       = 64K     #default: 32K, size of buffer to hold TX queries
## total per-thread buffer memory usage: 8832000K = 8.625GB
## Query Cache
query_cache_size        = 32M   #global buffer
query_cache_limit       = 512K  #max query result size to put in cache
## Connections
max_connections         = 2000  #multiplier for memory usage via per-thread buffers
max_connect_errors      = 100   #default: 10
concurrent_insert       = 2 #default: 1, 2: enable insert for all instances
connect_timeout         = 30    #default -5.1.22: 5, +5.1.22: 10
max_allowed_packet      = 32M   #max size of incoming data to allow
## Default Table Settings
sql_mode            = NO_AUTO_CREATE_USER
## Table and TMP settings
max_heap_table_size         = 1G    #recommend same size as tmp_table_size
bulk_insert_buffer_size     = 1G    #recommend same size as tmp_table_size
tmp_table_size                  = 1G    #recommend 1G min
#tmpdir                         = /data/mysql-tmp0:/data/mysql-tmp1 #Recommend using RAMDISK for tmpdir
## Table cache settings
table_cache             = 512   #5.0.x <default: 64>
table_open_cache        = 512   #5.1.x, 5.5.x <default: 64>
## Thread settings
thread_concurrency      = 16  #recommend 2x CPU cores
thread_cache_size       = 100 #recommend 5% of max_connections
## MyISAM Engine
key_buffer          = 1M    #global buffer
myisam_sort_buffer_size     = 128M  #index buffer size for creating/altering indexes
myisam_max_sort_file_size   = 256M  #max file size for tmp table when creating/alering indexes
myisam_repair_threads       = 4 #thread quantity when running repairs
myisam_recover          = BACKUP    #repair mode, recommend BACKUP 
## InnoDB Plugin Independent Settings
#innodb_data_home_dir            = /data2/var/lib/mysql
#innodb_data_file_path      = ibdata1:128M;ibdata2:10M:autoextend
#innodb_log_file_size       = 512M  #64G_RAM+ = 768, 24G_RAM+ = 512, 8G_RAM+ = 256, 2G_RAM+ = 128 
#innodb_log_files_in_group  = 4 #combined size of all logs <4GB. <2G_RAM = 2, >2G_RAM = 4
innodb_buffer_pool_size     = 32G   #global buffer
innodb_additional_mem_pool_size = 4M    #global buffer
innodb_status_file          #extra reporting
innodb_file_per_table           #enable always
innodb_flush_log_at_trx_commit  = 2 #2/0 = perf, 1 = ACID
innodb_table_locks      = 0 #preserve table locks
innodb_log_buffer_size      = 128M  #global buffer
innodb_lock_wait_timeout    = 60    
innodb_thread_concurrency   = 16    #recommend 2x core quantity
innodb_commit_concurrency   = 16    #recommend 4x num disks
#innodb_flush_method        = O_DIRECT     #O_DIRECT = local/DAS, O_DSYNC = SAN/iSCSI
innodb_support_xa       = 0        #recommend 0, disable xa to negate extra disk flush
skip-innodb-doublewrite
## Binlog sync settings
## XA transactions = 1, otherwise set to 0 for best performance
sync_binlog         = 1
## TX Isolation
transaction-isolation       = REPEATABLE-READ #REPEATABLE-READ req for ACID, SERIALIZABLE req XA
## Per-Thread Buffer memory utilization equation:
#(read_buffer_size + read_rnd_buffer_size + sort_buffer_size + thread_stack + join_buffer_size + binlog_cache_size) * max_connections
## Global Buffer memory utilization equation:
# innodb_buffer_pool_size + innodb_additional_mem_pool_size + innodb_log_buffer_size + key_buffer_size + query_cache_size
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
# Semisynchronous Replication
# http://dev.mysql.com/doc/refman/5.5/en/replication-semisync.html
# uncomment next line on MASTER
;plugin-load=rpl_semi_sync_master=semisync_master.so
# uncomment next line on SLAVE
;plugin-load=rpl_semi_sync_slave=semisync_slave.so
# Others options for Semisynchronous Replication
;rpl_semi_sync_master_enabled=1
;rpl_semi_sync_master_timeout=10
;rpl_semi_sync_slave_enabled=1
# http://dev.mysql.com/doc/refman/5.5/en/performance-schema.html
;performance_schema
thread_cache_size = 4
query_cache_type = 1
slow-query-log
long_query_time         = 10
slow_query_log_file     = /var/log/mysql/mysql-slow.log
log-warnings            = 2
# slave config
skip-slave-start
server-id = 22
#read_only = 1
skip-name-resolve
log-bin=/data2/var/log/mysql/mysql-bin
relay_log_purge=0
binlog-format=MIXED
#expire_logs_days = 31
log-bin-trust-function-creators = 1
slave-skip-errors = 1062,1146,1032
replicate-wild-ignore-table=%.norep%
#testing
table_definition_cache = 5000
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
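As a sanity check on the "Per-Thread Buffer memory utilization equation" comment in the config above, here is a minimal Python sketch that plugs in the per-thread values from the [mysqld] section. The dictionary and function names are only illustrative. It assumes the worst case where every one of the max_connections threads allocates each buffer in full, which is what the config comment assumes too.

```python
# Per-thread settings copied from the [mysqld] section above, in KiB.
PER_THREAD_KIB = {
    "thread_stack": 256,
    "sort_buffer_size": 1024,
    "read_buffer_size": 1024,
    "read_rnd_buffer_size": 1024,
    "join_buffer_size": 1024,
    "binlog_cache_size": 64,
}
MAX_CONNECTIONS = 2000  # max_connections from the same config


def per_thread_total_kib(buffers_kib, max_connections):
    """Worst-case total: sum of per-thread buffers times max_connections."""
    return sum(buffers_kib.values()) * max_connections


total_kib = per_thread_total_kib(PER_THREAD_KIB, MAX_CONNECTIONS)
print(total_kib, "KiB")                       # 8832000 KiB
print(total_kib / 1024, "MiB")                # 8625.0 MiB
print(round(total_kib / 1024 ** 2, 2), "GiB")  # 8.42 GiB
```

The 8832000K figure in the config comment matches. The "8.625GB" next to it corresponds to 8625 MiB, which is about 8.42 GiB when divided by 1024 again.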
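This machine was only promoted to master after its own master hit this problem (since I did not know the cause, I simply shut that one down).
I am running mysql-server-5.5.23-1.el5.remi.
Here is the MySQL log file.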
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Last MySQL binlog file position 0 795, file name /data/var/log/mysql/mysql-bin.000801
120719 10:25:47  InnoDB: Waiting for the background threads to start
120719 10:25:48 InnoDB: 1.1.8 started; log sequence number 8068861682758
120719 10:25:48 [Note] Recovering after a crash using /data/var/log/mysql/mysql-bin
120719 10:25:48 [Note] Starting crash recovery...
120719 10:25:48 [Note] Crash recovery finished.
120719 10:25:49 [Warning] 'proxies_priv' entry '@ root@svr201NTC-647.localdomain' ignored in --skip-name-resolve mode.
120719 10:25:49 [Note] Event Scheduler: Loaded 27 events
120719 10:25:49 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.23-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL) by Remi
120719 10:25:56 [Warning] Aborted connection 23 to db: 'reportingdb' user: 'adproject' host: '192.168.3.87' (Got an error reading communication packets)
120719 10:25:57 [Warning] Aborted connection 44 to db: 'reportingdb' user: 'adproject' host: '192.168.3.87' (Got an error reading communication packets)
120719 10:25:57 [Warning] Aborted connection 1 to db: 'reportingdb' user: 'adproject' host: '192.168.5.192' (Got an error reading communication packets)
03:25:57 UTC - mysqld got signal 8 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=19
max_threads=300
thread_count=16
connection_count=16
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 664409 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x1286fcd0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 49a340b8 thread_stack 0x40000
/usr/libexec/mysqld(my_print_stacktrace+0x2e)[0x7ae02e]
/usr/libexec/mysqld(handle_fatal_signal+0x3e2)[0x679da2]
/lib64/libpthread.so.0[0x3154a0ebe0]
/usr/libexec/mysqld(_ZN12ha_partition21min_rows_for_estimateEv+0x4f)[0x9223af]
[0x49a2bf60]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (11a7ad38): SELECT * FROM (
    SELECT B.bannerid, SUM(realclick)clicks, SUM(totalview)views,   
    ROUND((MAX(uv) + (SUM(uv) - MAX(uv))*0.1)*1.1)`uviews`  ,
        CAST(IF(SUM(realclick)=0 OR SUM(totalview)=0,'N/A',CONCAT(ROUND(SUM(realclick)*100/(SUM(totalview)),2))) as CHAR) `CTR`,
          SUM(A.`money`) money
    FROM `ox_banners` B
    INNER JOIN  `v3_ban_date_cpm7k` AS A   ON B.`campaignid` = A.`campaignid` AND B.`bannerid` = A.`bannerid`
    WHERE A.`bannerid` = NAME_CONST('_bannerid',144660) AND A.`campaignid` >0
    AND A.`dt` BETWEEN  NAME_CONST('_start',_latin1'2012-08-20' COLLATE 'latin1_swedish_ci') AND  NAME_CONST('_end',_latin1'2012-08-26' COLLATE 'latin1_swedish_ci')) A
Connection ID (thread ID): 47
Status: NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
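It looks as though MySQL crashes whenever a query touches partitions that do not exist in the table (note the 2012-08-20 and 2012-08-26 in the query above). What surprises me is that it never happened before.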
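One way to test that guess is to list the partitions that actually exist and compare their ranges with the dates in the crashing query. The sketch below only builds the lookup against information_schema.PARTITIONS as a parameterized statement for a DB-API driver that uses %s placeholders. The helper name is hypothetical, and both the schema name (reportingdb, taken from the aborted-connection warnings) and the choice of v3_ban_date_cpm7k as the partitioned table are assumptions, since the log does not say which table ha_partition was handling.

```python
def partition_listing_query(schema, table):
    """Return (sql, params) listing the partitions defined for one table.

    The placeholders follow the %s paramstyle used by common MySQL
    DB-API drivers; the driver substitutes the values safely.
    """
    sql = (
        "SELECT PARTITION_NAME, PARTITION_METHOD, PARTITION_EXPRESSION, "
        "PARTITION_DESCRIPTION, TABLE_ROWS "
        "FROM information_schema.PARTITIONS "
        "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s "
        "ORDER BY PARTITION_ORDINAL_POSITION"
    )
    return sql, (schema, table)


# Assumed schema and table names; adjust to the table that is really partitioned.
sql, params = partition_listing_query("reportingdb", "v3_ban_date_cpm7k")
print(sql)
print(params)
```

If no listed partition covers 2012-08-20 to 2012-08-26, that would be consistent with the crash being triggered by a query whose date range falls outside every partition.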
I have plenty of free RAM.
free -m
             total       used       free     shared    buffers     cached
Mem:         48289      31600      16688          0        666      23848
-/+ buffers/cache:       7085      41203
Swap:         8189         54       8135
File permissions on the data directory:
drwxr-xr-x 48 mysql mysql 118784 Jul 18 16:20 /data2/var/lib/mysql
And on mysqld.log:
-rw-r----- 1 mysql mysql 36M Jul 18 17:14 /var/log/mysqld.log
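So how can I find out what is causing my tables to crash, and how do I fix it?
Update, July 19: updated the logs.
I have been trying to debug by following this guide, but it has not helped: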
# resolve_stack_dump -s /tmp/mysqld.sym -n mysqld.stack | c++filt 
0x7ae02e my_print_stacktrace + 46
0x679da2 handle_fatal_signal + 994
0x3fed00ebe0 _end + -335143280
0x9223af ha_partition::min_rows_for_estimate() + 79
0x417c2ff0 _end + 1082221216
# addr2line -fie /usr/libexec/mysqld 0x2e
??
??:0
# addr2line -fie /usr/libexec/mysqld 0x3e2
??
??:0
# addr2line -fie /usr/libexec/mysqld 0x4f
??
??:0
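A possible reason for the ?? results: the addr2line calls above were given the +0x offsets inside each function (0x2e, 0x3e2, 0x4f), whereas addr2line normally expects the absolute address, which is the value in square brackets on each backtrace line. The Python sketch below pulls those bracketed addresses out of the frames that belong to /usr/libexec/mysqld and prints the matching command. The regular expression and function name are illustrative, and whether file and line numbers resolve still depends on debug symbols being available for this build, which is an assumption here.

```python
import re

# Matches frames such as: /path/to/binary(symbol+0x2e)[0x7ae02e]
# The "(symbol+offset)" part is optional; the bracketed address is required.
FRAME_RE = re.compile(
    r"^(?P<binary>[^\s(\[]+)(?:\((?P<symbol>[^)]*)\))?\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def frame_addresses(backtrace, binary):
    """Return the absolute addresses of frames that belong to `binary`."""
    addrs = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group("binary") == binary:
            addrs.append(match.group("addr"))
    return addrs


# Backtrace lines copied from the crash log above.
BACKTRACE = """\
/usr/libexec/mysqld(my_print_stacktrace+0x2e)[0x7ae02e]
/usr/libexec/mysqld(handle_fatal_signal+0x3e2)[0x679da2]
/lib64/libpthread.so.0[0x3154a0ebe0]
/usr/libexec/mysqld(_ZN12ha_partition21min_rows_for_estimateEv+0x4f)[0x9223af]
[0x49a2bf60]
"""

addrs = frame_addresses(BACKTRACE, "/usr/libexec/mysqld")
command = "addr2line -fie /usr/libexec/mysqld " + " ".join(addrs)
print(command)
```

The frames from libpthread and the bare [0x49a2bf60] entry are skipped on purpose, because those addresses do not belong to the mysqld binary.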