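I am trying to set up a MariaDB Galera cluster on Debian Wheezy 7.5. I have found many different sets of instructions, all slightly different, but so far none of them has worked.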

I am trying to set up a two-node cluster.

On the primary node, I use the default my.cnf, with these additional settings in conf.d/cluster.cnf:

[mysqld]
#mysql settings
bind-address=10.1.1.139
query_cache_size=0
query_cache_type=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
innodb_doublewrite=1

#galera settings
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://10.1.1.139,10.1.1.140"
wsrep_sst_method=rsync
wsrep_cluster_name="sql_cluster"
wsrep_node_incoming_address=10.1.1.139
wsrep_sst_receive_address=10.1.1.139
wsrep_sst_auth=cluster:password
wsrep_node_address='10.1.1.139'
wsrep_node_name='sql1'
wsrep_on=ON

I created the cluster user, granted it all the required privileges, and started the server successfully:

service mysql start --wsrep-new-cluster

The cluster starts, and I can see cluster_size=1.
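For reference, the cluster size can be read from the wsrep_cluster_size status variable. A minimal sketch, assuming the mysql client is installed and that root credentials are available (the credentials and the helper name are assumptions, not taken from the post):

```shell
# Print the value column for wsrep_cluster_size.
# Reads the tab-separated output of `mysql -N -B` on stdin.
wsrep_cluster_size() {
    awk -F'\t' '$1 == "wsrep_cluster_size" { print $2 }'
}

# Typical use on a running node (root login is an assumption):
#   mysql -u root -p -N -B -e "SHOW STATUS LIKE 'wsrep_cluster_size'" | wsrep_cluster_size
```

After the second node joins successfully, the same check on either node should report 2.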

On the second node, I use the default my.cnf, with these additional settings in conf.d/cluster.cnf:

[mysqld]
#mysql settings
bind-address=10.1.1.140
query_cache_size=0
query_cache_type=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
innodb_doublewrite=1

#galera settings
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://10.1.1.139,10.1.1.140"
wsrep_sst_method=rsync
wsrep_cluster_name="sql_cluster"
wsrep_node_incoming_address=10.1.1.140
wsrep_sst_receive_address=10.1.1.140
wsrep_sst_auth=cluster:password
wsrep_node_address='10.1.1.140'
wsrep_node_name='sql1'
wsrep_on=ON

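Following the advice at the link below, I also replaced debian.cnf on the secondary node with the debian.cnf from the primary node:

http://docs.openstack.org/high-availability-guide/content/ha-aa-db-mysql-galera.html and granted the appropriate privileges (this is also suggested elsewhere; I don't have the exact link right now).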

Contents of debian.cnf on both nodes:

[client]
host = localhost
user = debian-sys-maint
password = <password>
socket = /var/run/mysqld/mysqld.sock
[mysql_upgrade]
host = localhost
user = debian-sys-maint
password = <password>
socket = /var/run/mysqld/mysqld.sock
basedir = /usr

When I try to start the second node with:

service mysql start

It fails, and I get this in /var/log/syslog:

May  7 19:45:30 ns514282 mysqld_safe: Starting mysqld daemon with databases from /var/lib/mysql
May  7 19:45:30 ns514282 mysqld_safe: WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.s6Uwyc' --pid-file='/var/lib/mysql/ns514282.ip-167-114-159.net-recover.pid'
May  7 19:45:33 ns514282 mysqld_safe: WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: Read nil XID from storage engines, skipping position init
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: wsrep_load(): Galera 3.9(rXXXX) by Codership Oy <info@codership.com> loaded successfully.
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: CRC-32C: using hardware acceleration.
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: Passing config to GCS: base_host = 10.1.1.142; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npv
May  7 19:45:33 ns514282 mysqld: o = false; pc.recovery 
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: Service thread queue flushed.
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: wsrep_sst_grab()
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: Start replication
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: protonet asio version 0
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: Using CRC-32C for message checksums.
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: backend: asio
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: restore pc from disk successfully
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: GMCast version 0
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: (66b559a2, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: (66b559a2, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: EVS version 0
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: gcomm: connecting to group 'bfm_cluster', peer '10.1.1.141:,10.1.1.142:'
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Warning] WSREP: (66b559a2, 'tcp://0.0.0.0:4567') address 'tcp://10.1.1.142:4567' points to own listening address, blacklisting
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: (66b559a2, 'tcp://0.0.0.0:4567') address 'tcp://10.1.1.142:4567' pointing to uuid 66b559a2 is blacklisted, skipping
May  7 19:45:33 ns514282 mysqld: 150507 19:45:33 [Note] WSREP: (66b559a2, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: 
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: declaring dc2b490d at tcp://10.1.1.141:4567 stable
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: re-bootstrapping prim from partitioned components
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: view(view_id(PRIM,66b559a2,12) memb {
May  7 19:45:34 ns514282 mysqld: #01166b559a2,0
May  7 19:45:34 ns514282 mysqld: #011dc2b490d,0
May  7 19:45:34 ns514282 mysqld: } joined {
May  7 19:45:34 ns514282 mysqld: } left {
May  7 19:45:34 ns514282 mysqld: } partitioned {
May  7 19:45:34 ns514282 mysqld: })
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: save pc into disk
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: clear restored view
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: gcomm: connected
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: Opened channel 'bfm_cluster'
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: Waiting for SST to complete.
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 279db665-f513-11e4-9149-aa318d13ebc4
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: STATE EXCHANGE: sent state msg: 279db665-f513-11e4-9149-aa318d13ebc4
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: STATE EXCHANGE: got state msg: 279db665-f513-11e4-9149-aa318d13ebc4 from 0 (sql1)
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: STATE EXCHANGE: got state msg: 279db665-f513-11e4-9149-aa318d13ebc4 from 1 (sql3)
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Warning] WSREP: Quorum: No node with complete state:
May  7 19:45:34 ns514282 mysqld: 
May  7 19:45:34 ns514282 mysqld: 
May  7 19:45:34 ns514282 mysqld: #011Version      : 3
May  7 19:45:34 ns514282 mysqld: #011Flags        : 0x1
May  7 19:45:34 ns514282 mysqld: #011Protocols    : 0 / 7 / 3
May  7 19:45:34 ns514282 mysqld: #011State        : NON-PRIMARY
May  7 19:45:34 ns514282 mysqld: #011Prim state   : NON-PRIMARY
May  7 19:45:34 ns514282 mysqld: #011Prim UUID    : 00000000-0000-0000-0000-000000000000
May  7 19:45:34 ns514282 mysqld: #011Prim  seqno  : -1
May  7 19:45:34 ns514282 mysqld: #011First seqno  : -1
May  7 19:45:34 ns514282 mysqld: #011Last  seqno  : -1
May  7 19:45:34 ns514282 mysqld: #011Prim JOINED  : 0
May  7 19:45:34 ns514282 mysqld: #011State UUID   : 279db665-f513-11e4-9149-aa318d13ebc4
May  7 19:45:34 ns514282 mysqld: #011Group UUID   : 00000000-0000-0000-0000-000000000000
May  7 19:45:34 ns514282 mysqld: #011Name         : 'sql1'
May  7 19:45:34 ns514282 mysqld: #011Incoming addr: '10.1.1.142:3306'
May  7 19:45:34 ns514282 mysqld: 
May  7 19:45:34 ns514282 mysqld: #011Version      : 3
May  7 19:45:34 ns514282 mysqld: #011Flags        : 0x2
May  7 19:45:34 ns514282 mysqld: #011Protocols    : 0 / 7 / 3
May  7 19:45:34 ns514282 mysqld: #011State        : NON-PRIMARY
May  7 19:45:34 ns514282 mysqld: #011Prim state   : SYNCED
May  7 19:45:34 ns514282 mysqld: #011Prim UUID    : b65a0277-f50f-11e4-a916-dbeff5b65a2e
May  7 19:45:34 ns514282 mysqld: #011Prim  seqno  : 8
May  7 19:45:34 ns514282 mysqld: #011First seqno  : -1
May  7 19:45:34 ns514282 mysqld: #011Last  seqno  : 0
May  7 19:45:34 ns514282 mysqld: #011Prim JOINED  : 1
May  7 19:45:34 ns514282 mysqld: #011State UUID   : 279db665-f513-11e4-9149-aa318d13ebc4
May  7 19:45:34 ns514282 mysqld: #011Group UUID   : dc2be55b-f506-11e4-8748-4bd7f3fc795c
May  7 19:45:34 ns514282 mysqld: #011Name         : 'sql3'
May  7 19:45:34 ns514282 mysqld: #011Incoming addr: '10.1.1.141:3306'
May  7 19:45:34 ns514282 mysqld: 
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: Full re-merge of primary b65a0277-f50f-11e4-a916-dbeff5b65a2e found: 1 of 1.
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: Quorum results:
May  7 19:45:34 ns514282 mysqld: #011version    = 3,
May  7 19:45:34 ns514282 mysqld: #011component  = PRIMARY,
May  7 19:45:34 ns514282 mysqld: #011conf_id    = 8,
May  7 19:45:34 ns514282 mysqld: #011members    = 1/2 (joined/total),
May  7 19:45:34 ns514282 mysqld: #011act_id     = 0,
May  7 19:45:34 ns514282 mysqld: #011last_appl. = -1,
May  7 19:45:34 ns514282 mysqld: #011protocols  = 0/7/3 (gcs/repl/appl),
May  7 19:45:34 ns514282 mysqld: #011group UUID = dc2be55b-f506-11e4-8748-4bd7f3fc795c
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: Flow-control interval: [23, 23]
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 0)
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: State transfer required: 
May  7 19:45:34 ns514282 mysqld: #011Group state: dc2be55b-f506-11e4-8748-4bd7f3fc795c:0
May  7 19:45:34 ns514282 mysqld: #011Local state: 00000000-0000-0000-0000-000000000000:-1
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: New cluster view: global state: dc2be55b-f506-11e4-8748-4bd7f3fc795c:0, view# 9: Primary, number of nodes: 2, my index: 0, protocol version 3
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Warning] WSREP: Gap in state sequence. Need state transfer.
May  7 19:45:34 ns514282 mysqld: 150507 19:45:34 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.1.1.142' --auth 'cluster:password' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '12278' --binlog '/var/log/mysql/mariadb-bin' '
May  7 19:45:34 ns514282 rsyncd[12428]: rsyncd version 3.0.9 starting, listening on port 4444
May  7 19:45:37 ns514282 mysqld: 150507 19:45:37 [Note] WSREP: (66b559a2, 'tcp://0.0.0.0:4567') turning message relay requesting off
May  7 19:45:47 ns514282 /usr/sbin/irqbalance: Load average increasing, re-enabling all cpus for irq balancing
May  7 19:45:57 ns514282 /usr/sbin/irqbalance: Load average increasing, re-enabling all cpus for irq balancing
May  7 19:46:02 ns514282 /USR/SBIN/CRON[16491]: (root) CMD (/usr/local/rtm/bin/rtm 50 > /dev/null 2> /dev/null)
May  7 19:46:03 ns514282 /etc/init.d/mysql[16711]: 0 processes alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping' resulted in
May  7 19:46:03 ns514282 /etc/init.d/mysql[16711]: #007/usr/bin/mysqladmin: connect to server at 'localhost' failed
May  7 19:46:03 ns514282 /etc/init.d/mysql[16711]: error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111 "Connection refused")'
May  7 19:46:03 ns514282 /etc/init.d/mysql[16711]: Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!
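One detail worth cross-checking: in the log above, the "gcomm: connecting to group" line reports group 'bfm_cluster' with peers 10.1.1.141 and 10.1.1.142, while the posted cluster.cnf uses "sql_cluster" with 10.1.1.139 and 10.1.1.140. That may simply be inconsistent anonymization, but it could also mean a different config file is being loaded. A small sketch for pulling those values out of the syslog so they can be compared with cluster.cnf (the helper names are hypothetical):

```shell
# Print the group (cluster) name Galera reported when connecting.
# Reads log lines on stdin.
gcomm_group() {
    sed -n "s/.*gcomm: connecting to group '\([^']*\)'.*/\1/p"
}

# Print the peer list from the same log line.
gcomm_peers() {
    sed -n "s/.*gcomm: connecting to group '[^']*', peer '\([^']*\)'.*/\1/p"
}

# Typical use on the joining node:
#   gcomm_group < /var/log/syslog
#   gcomm_peers < /var/log/syslog
```

If the printed group or peers differ from wsrep_cluster_name and wsrep_cluster_address in cluster.cnf, the running server is not using the settings shown in the question.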

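There are countless threads about this problem on the internet. Some have no answers. Others have answers that do not apply here:

ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111) - my disk is not full

Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' - no answers. But going by the comments, mysql.sock does exist and is owned by mysql.mysql.

ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) - the server is installed, and again the socket is in the right place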

I have also read that this could be a permissions problem on /var/run/mysqld, but I have checked this and given it mysql.mysql ownership.
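The two error codes in the messages quoted above mean different things on Linux: (2) is ENOENT, meaning the socket file does not exist, while (111) is ECONNREFUSED, meaning the file exists but nothing is accepting connections on it. The syslog in this question also reports "0 processes alive", which suggests mysqld itself had stopped rather than a permissions problem. A minimal sketch for telling these cases apart (the helper name is hypothetical):

```shell
# Report whether a Unix socket file exists at the given path.
socket_state() {
    if [ -S "$1" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Typical use, with the socket path from my.cnf:
#   socket_state /var/run/mysqld/mysqld.sock
#   ls -ld /var/run/mysqld    # owner and mode of the directory
#   pgrep -l mysqld           # is any mysqld process alive?
```

A "present" socket with no mysqld process usually points to a stale socket left behind by a server that exited.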

If nothing else, this is an attempt to revive the question. Any pointers are much appreciated.

Thanks,

Update: the my.cnf for both nodes. It is the default my.cnf. The only change is that the bind-address=127.0.0.1 line is commented out.

[client]
port        = 3306
socket      = /var/run/mysqld/mysqld.sock

[mysqld_safe]
socket      = /var/run/mysqld/mysqld.sock
nice        = 0

[mysqld]
user        = mysql
pid-file    = /var/run/mysqld/mysqld.pid
socket      = /var/run/mysqld/mysqld.sock
port        = 3306
basedir     = /usr
datadir     = /var/lib/mysql
tmpdir      = /tmp
lc_messages_dir = /usr/share/mysql
lc_messages = en_US
skip-external-locking

# bind-address      = 127.0.0.1

max_connections     = 100
connect_timeout     = 5
wait_timeout        = 600
max_allowed_packet  = 16M
thread_cache_size       = 128
sort_buffer_size    = 4M
bulk_insert_buffer_size = 16M
tmp_table_size      = 32M
max_heap_table_size = 32M

myisam_recover          = BACKUP
key_buffer_size     = 128M
table_open_cache    = 400
myisam_sort_buffer_size = 512M
concurrent_insert   = 2
read_buffer_size    = 2M
read_rnd_buffer_size    = 1M

query_cache_limit       = 128K
query_cache_size        = 64M

log_warnings        = 2

slow_query_log_file = /var/log/mysql/mariadb-slow.log
long_query_time = 10
log_slow_verbosity  = query_plan

log_bin         = /var/log/mysql/mariadb-bin
log_bin_index       = /var/log/mysql/mariadb-bin.index
expire_logs_days    = 10
max_binlog_size         = 100M

default_storage_engine  = InnoDB

innodb_buffer_pool_size = 256M
innodb_log_buffer_size  = 8M
innodb_file_per_table   = 1
innodb_open_files   = 400
innodb_io_capacity  = 400
innodb_flush_method = O_DIRECT

[mysqldump]
quick
quote-names
max_allowed_packet  = 16M

[mysql]

[isamchk]
key_buffer      = 16M

!includedir /etc/mysql/conf.d/

Update: I also tested starting the node normally on its own (no cluster, no extra settings, just the defaults), and that works.

2 Answers

I recommend restarting the server.

If you keep having problems, back up your data, uninstall, and install again.

That is how it worked for me.

Also verify that every IP address used matches the bind-address settings.

Answered 2020-01-05T23:30:38.257

Is "wsrep_node_name" correct?

[mysqld]
#mysql settings
bind-address=10.1.1.140
query_cache_size=0
query_cache_type=0
binlog_format=ROW 
default_storage_engine=innodb 
innodb_autoinc_lock_mode=2 
innodb_doublewrite=1

#galera settings
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://10.1.1.139,10.1.1.140"
wsrep_sst_method=rsync 
wsrep_cluster_name="sql_cluster" 
wsrep_node_incoming_address=10.1.1.140 
wsrep_sst_receive_address=10.1.1.140 
wsrep_sst_auth=cluster:password 
wsrep_node_address='10.1.1.140' 
wsrep_node_name='sql1' <== ???
wsrep_on=ON
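
Each node normally gets its own wsrep_node_name, and both cluster.cnf files in the question set it to 'sql1'. Whether that duplicate caused this particular failure is not established here. A rough sketch for comparing the names, assuming both nodes' files have been copied into one directory as node1.cnf and node2.cnf (hypothetical file names):

```shell
# Print the wsrep_node_name value from a cluster.cnf-style file,
# with quotes and spaces stripped.
node_name() {
    grep '^wsrep_node_name=' "$1" | cut -d= -f2 | tr -d "'\" "
}

# Succeed (exit status 0) when both files declare the same node name.
same_node_name() {
    [ "$(node_name "$1")" = "$(node_name "$2")" ]
}

# Typical use:
#   same_node_name node1.cnf node2.cnf && echo "duplicate wsrep_node_name"
```

If the check reports a duplicate, giving the second node a distinct name such as 'sql2' removes the ambiguity in the logs.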
Answered 2020-01-06T00:20:13.937