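I am currently testing MaxScale in read/write split mode with a 3-node Galera Cluster. By default, MaxScale designates one node as the master and the others as slaves (my configuration uses 100% of the slaves, i.e. max_slave_connections=100%).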

My goal is to check how MaxScale handles a node shutdown.

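The problem is that with benchmarks (Sysbench, Mysqlslap) and with a custom script (PHP), the connection to the backend (MariaDB) is lost as soon as I shut down one node of the cluster.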

Error log:

MariaDB Corporation MaxScale    /var/log/maxscale/error1.log Thu Oct 29 13:00:11 2015
-----------------------------------------------------------------------
---     Logging is enabled.
2015-10-29 13:00:11   Error: Failed to obtain address for host ::1, Address family for hostname not supported
2015-10-29 13:00:11   Warning: Failed to add user root@::1 for service [RW Split Router]. This user will be unavailable via MaxScale.
2015-10-29 13:00:11   Warning: Duplicate MySQL user found for service [RW Split Router]: cmon@127.0.0.1 for database: (null)
2015-10-29 13:00:11   Warning: Duplicate MySQL user found for service [RW Split Router]: root@127.0.0.1 for database: (null)
2015-10-29 13:00:11   Warning: Duplicate MySQL user found for service [RW Split Router]: root@10.58.224.113 for database: (null)
2015-10-29 13:00:35   Error : Unable to write to backend due to authentication failure.
2015-10-29 13:00:40   Error : Monitor was unable to connect to server 10.58.224.113:3306 : "Can't connect to MySQL server on '10.58.224.113' (111)"

Trace log:

2015-10-29 13:00:33   [4]  Route query to slave         10.58.224.113:3306 <
2015-10-29 13:00:33   [4]  Servers and router connection counts:
2015-10-29 13:00:33   [4]  current operations : 0 in    10.58.224.113:3306 RUNNING SLAVE
2015-10-29 13:00:33   [4]  current operations : 0 in    10.26.116.84:3306 RUNNING SLAVE
2015-10-29 13:00:33   [4]  current operations : 0 in    10.26.84.103:3306 RUNNING MASTER
2015-10-29 13:00:33   [4]  Selected RUNNING SLAVE in    10.58.224.113:3306
2015-10-29 13:00:33   [4]  Selected RUNNING SLAVE in    10.26.116.84:3306
2015-10-29 13:00:33   [4]  Selected RUNNING MASTER in   10.26.84.103:3306
2015-10-29 13:00:34   [4]  > Autocommit: [enabled], trx is [not open], cmd: COM_QUERY, type: QUERY_TYPE_READ, stmt: SELECT COUNT(*) FROM sbtest1
2015-10-29 13:00:34   [4]  Route query to slave         10.58.224.113:3306 <
2015-10-29 13:00:36   [4]  Stopped RW Split Router client session [4]
2015-10-29 13:00:42   Server changed state: server1[10.58.224.113:3306]: slave_down

PHP test script:

<?php

# Test MaxScale

$db = new PDO('mysql:host=127.0.0.1;dbname=sbtest;charset=utf8;port=4446;', 'root', '***', array(PDO::ATTR_TIMEOUT => "10", PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION));
for($i=0; $i<5000; $i++)
{
    try{
            $q = $db->query('SELECT COUNT(*) FROM sbtest1', PDO::FETCH_NUM);
            if($q){
                    $res = $q->fetchAll();
                    #var_dump($res);
                    echo time()." Result: {$res[0][0]}\n";
                    sleep(1);
            }
    }
    catch(PDOException $Exception) {
            echo "PDOException: " . $Exception->getMessage() . "\n";
            die('forced script to stop');
    }
}

Mysqlslap benchmark:

mysqlslap -h127.0.0.1 -uroot -p*** -P4446   --create="CREATE TABLE a (b int);INSERT INTO a VALUES (23)"  --query="SELECT * FROM a" --concurrency=50 --iterations=200 --delimiter=";"

Sysbench benchmark:

sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --oltp-table-size=2500 --mysql-user=root --mysql-password=*** --mysql-host=127.0.0.1 --db-ps-mode=disable --mysql-port=4446 prepare 

sysbench --num-threads=16 --max-requests=5000 --test=/usr/share/doc/sysbench/tests/db/oltp.lua --oltp-skip-trx=on --oltp-read-only=on --oltp-table-size=250000 --mysql-host=127.0.0.1  --mysql-user=root --mysql-password=*** --mysql-port=4446 run

Errors encountered:

PDOException: SQLSTATE[HY000]: General error: 2003 Authentication with backend failed. Session will be closed.
PDOException: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away
PDOException: SQLSTATE[HY000]: General error: 2013 Lost connection to MySQL server during query
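
To tell whether only the sessions already routed to the stopped node are affected, or whether new connections through MaxScale fail as well, the same query can be retried on a fresh connection. The following is a minimal sketch in shell, not a fix: `retry` is a hypothetical helper of my own (it is not part of MaxScale or the mysql client), and the host, port, database and table in the usage comment are the ones from the test setup above.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to MAX_ATTEMPTS times,
# sleeping DELAY seconds between failed attempts.
# Usage: retry MAX_ATTEMPTS DELAY COMMAND [ARGS...]
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # Each run of the command opens its own connection.
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example (assumes the listener on port 4446 from this setup):
# retry 3 1 mysql -h127.0.0.1 -P4446 -uroot -p'***' -N \
#     -e 'SELECT COUNT(*) FROM sbtest1' sbtest
```

If the second attempt succeeds, new sessions are being routed to the surviving nodes and only the existing session was closed; if every attempt fails, the problem is not limited to sessions that were open at shutdown time.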

MaxScale configuration:

[maxscale]
threads=4
auth_connect_timeout=20
auth_read_timeout=20
auth_write_timeout=20
log_trace=1

[Galera Monitor]
type=monitor
module=galeramon
servers=server1,server2,server3
user=maxmon
passwd=***
monitor_interval=30000
backend_connect_timeout=10
backend_read_timeout=10
backend_write_timeout=10

[RW Split Router]
type=service
router=readwritesplit
servers=server2,server3,server1
user=root
passwd=***
max_slave_connections=100%
enable_root_user=1
router_options=slave_selection_criteria=LEAST_CURRENT_OPERATIONS

[Debug Interface]
type=service
router=debugcli

[CLI]
type=service
router=cli

[RW Split Listener]
type=listener
service=RW Split Router
protocol=MySQLClient
port=4446

[Debug Listener]
type=listener
service=Debug Interface
protocol=telnetd
address=127.0.0.1
port=4442

[CLI Listener]
type=listener
service=CLI
protocol=maxscaled
port=6603

[server1]
type=server
address=10.58.224.113
port=3306
protocol=MySQLBackend

[server2]
type=server
address=10.26.84.103
port=3306
protocol=MySQLBackend

[server3]
type=server
address=10.26.116.84
port=3306
protocol=MySQLBackend

Session monitoring shows that sessions become invalid, as in this example:

# maxadmin -pmariadb show sessions

Session 9 (0x7f60a4000b50)
State:          Invalid State
Service:        RW Split Router (0x342f460)
Client DCB:     0x7f60a40009a0
Client Address:     root@127.0.0.1
Connected:      Thu Oct 29 13:28:57 2015

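I have also tried different timeout values and monitor_interval settings in MaxScale, as well as a different PDO timeout in my PHP test script, but the problem seems to lie in how MaxScale handles MySQL sessions.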

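I have also read about MaxScale's optimistic approach, in which it forwards the fastest response it receives from one of the nodes, but I am not sure whether that is the cause.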

Is there a way to make a node shutdown harmless to any SQL request that MaxScale distributes across the slave nodes of the cluster?
