
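We have a Percona XtraDB Cluster with 5 nodes and one arbitrator. One of our PHP developers ran a bad query against the cluster, and it brought down every node. After the crash we could not collect any error logs to tell us what actually went wrong, because the whole cluster went down without logging anything.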

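I have always assumed that a single query executed against the cluster is processed by only one node in the cluster. So if the query is bad (bad enough to kill the database server), it should crash only the one node that is processing it and leave the cluster running on the remaining 4 nodes.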

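This behavior puzzles us, and we want to understand what really happened, especially since this is the second time it has occurred. Why would a query that is being processed by one node cause the other nodes in the cluster to crash when that query runs into a problem?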

Here is our my.cnf configuration:

#
# Default values.
[mysqld_safe]
flush_caches
numa_interleave
#
#
[mysqld]
back_log = 65535
binlog_format = ROW
character_set_server = utf8
collation_server = utf8_general_ci
datadir = /var/lib/mysql
default_storage_engine = InnoDB
expand_fast_index_creation = 1
expire_logs_days = 7
innodb_autoinc_lock_mode = 2
innodb_buffer_pool_instances = 16
innodb_buffer_pool_populate = 1
innodb_buffer_pool_size = 32G   # XXX 64GB RAM, 80%
innodb_data_file_path = ibdata1:64M;ibdata2:64M:autoextend
innodb_file_format = Barracuda
innodb_file_per_table
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_io_capacity = 1600
innodb_large_prefix
innodb_locks_unsafe_for_binlog = 1
innodb_log_file_size = 64M
innodb_print_all_deadlocks = 1
innodb_read_io_threads = 64
innodb_stats_on_metadata = FALSE
innodb_support_xa = FALSE
innodb_write_io_threads = 64
log-bin = mysqld-bin
log-queries-not-using-indexes
log-slave-updates
long_query_time = 1
max_allowed_packet = 64M
max_connect_errors = 4294967295
max_connections = 4096
min_examined_row_limit = 1000
port = 3306
relay-log-recovery = TRUE
skip-name-resolve
slow_query_log = 1
slow_query_log_timestamp_always = 1
table_open_cache = 4096
thread_cache = 1024
tmpdir = /db/tmp
transaction_isolation = REPEATABLE-READ
updatable_views_with_limit = 0
user = mysql
wait_timeout = 60
#
# Galera Variable config 
wsrep_cluster_address = gcomm://ip_1, ip_2, ip_3,ip_4,ip_4,ip_5
wsrep_cluster_name = cluster_db
wsrep_provider = /usr/lib/libgalera_smm.so
wsrep_provider_options = "gcache.size=4G"
wsrep_slave_threads = 32
wsrep_sst_auth = "user:password"
wsrep_sst_donor = "db1"
#wsrep_sst_method = xtrabackup_throttle
wsrep_sst_method = xtrabackup-v2
#
# XXX You *MUST* change!
server-id = 1

1 Answer


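Can you post the query? SELECT queries execute on a single node only, but every write query is executed everywhere, that is, on all nodes. What is in your error logs?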
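
To illustrate the read/write distinction, here is a deliberately simplified toy model in Python. It is not Galera's API or its real replication protocol: the `Node` and `Cluster` classes and the `"POISON"` value are hypothetical, and the sketch assumes a write that makes any node fail when it is applied. Under that assumption, it shows one way a single statement could affect every node, without claiming this is what happened in your cluster.

```python
class Node:
    """Toy stand-in for one cluster node (hypothetical, not a Galera API)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.alive = True

    def apply(self, key, value):
        # Assumption for illustration: a "poison" change crashes
        # whichever node tries to apply it.
        if value == "POISON":
            self.alive = False
            return
        self.rows[key] = value


class Cluster:
    """Toy cluster: reads are local, writes are applied on every node."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def read(self, node_index, key):
        # A read is served only by the node the client is connected to.
        return self.nodes[node_index].rows.get(key)

    def write(self, key, value):
        # A write is replicated: every live node applies the same change.
        for node in self.nodes:
            if node.alive:
                node.apply(key, value)

    def live_nodes(self):
        return [n.name for n in self.nodes if n.alive]


cluster = Cluster(["db1", "db2", "db3", "db4", "db5"])
cluster.write("a", 1)
print(cluster.read(0, "a"))    # the value is visible on db1 after replication
cluster.write("b", "POISON")   # the same bad change reaches every node
print(cluster.live_nodes())    # no node survives in this toy model
```

In this sketch a bad read would only ever touch the node it was sent to, while a bad write is applied by all five nodes, which is why knowing the exact query matters.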

answered 2015-08-21T19:16:10.167