I have a two-node high-availability cluster with a DRBD resource, a virtual IP, and the MariaDB files shared on the DRBD partition.

Everything seems to work, but DRBD is not syncing the latest files I created, even though the DRBD status reports the disk as UpToDate.

sudo drbdadm status 
iba role:Primary
  disk:UpToDate

pcs does not show any errors either:

sudo pcs status 
Cluster name: cluster_iba
Cluster Summary:
  * Stack: corosync
  * Current DC: iba2-ip192 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Tue Feb 22 18:16:20 2022
  * Last change:  Mon Feb 21 16:19:38 2022 by root via cibadmin on iba1-ip192
  * 2 nodes configured
  * 6 resource instances configured

Node List:
  * Online: [ iba1-ip192 iba2-ip192 ]

Full List of Resources:
  * virtual_ip  (ocf::heartbeat:IPaddr2):    Started iba2-ip192
  * Clone Set: DrbdData-clone [DrbdData] (promotable):
    * Masters: [ iba2-ip192 ]
    * Slaves: [ iba1-ip192 ]
  * DrbdFS  (ocf::heartbeat:Filesystem):     Started iba2-ip192
  * WebServer   (ocf::heartbeat:apache):     Started iba2-ip192
  * Maria   (ocf::heartbeat:mysql):  Started iba2-ip192

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

All constraints:

sudo pcs constraint list --full
Location Constraints:
Ordering Constraints:
  promote DrbdData-clone then start DrbdFS (kind:Mandatory) (id:order-DrbdData-clone-DrbdFS-mandatory)
  start DrbdFS then start virtual_ip (kind:Mandatory) (id:order-DrbdFS-virtual_ip-mandatory)
  start virtual_ip then start WebServer (kind:Mandatory) (id:order-virtual_ip-WebServer-mandatory)
  start DrbdFS then start Maria (kind:Mandatory) (id:order-DrbdFS-Maria-mandatory)
Colocation Constraints:
  DrbdFS with DrbdData-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-DrbdFS-DrbdData-clone-INFINITY)
  virtual_ip with DrbdFS (score:INFINITY) (id:colocation-virtual_ip-DrbdFS-INFINITY)
  WebServer with virtual_ip (score:INFINITY) (id:colocation-WebServer-virtual_ip-INFINITY)
  Maria with DrbdFS (score:INFINITY) (id:colocation-Maria-DrbdFS-INFINITY)
Ticket Constraints:

Files in /mnt/datosDRBD on node iba2-ip192 (when it is the primary):

/mnt/datosDRBD$ ls -l
total 80
-rw-r--r-- 1 root  root   5801 feb 21 12:16 drbd_cfg
-rw-r--r-- 1 root  root  10494 feb 21 12:18 fs_cfg
drwx------ 2 root  root  16384 feb 21 10:12 lost+found
drwxr-xr-x 4 mysql mysql  4096 feb 22 18:00 mariaDB
-rw-r--r-- 1 root  root  17942 feb 21 12:39 MariaDB_cfg
-rw-r--r-- 1 root  root      5 feb 21 10:13 testMParicio.txt
-rw-r--r-- 1 root  root  13578 feb 21 12:21 WebServer_cfg

And the files in /mnt/datosDRBD on node iba1-ip192 (when it is the primary):

ls -l
total 92
-rw-r--r-- 1 root     root      5801 feb 21 12:16 drbd_cfg
drwxrwxrwx 5 www-data www-data  4096 feb 22 13:41 FilesSGITV
-rw-r--r-- 1 root     root     10494 feb 21 12:18 fs_cfg
drwx------ 2 root     root     16384 feb 21 10:12 lost+found
drwxr-xr-x 7 mysql    mysql     4096 feb 22 17:55 mariaDB
-rw-r--r-- 1 root     root     17942 feb 21 12:39 MariaDB_cfg
-rw-r--r-- 1 root     root         5 feb 22 17:58 testMParicio2.txt
-rw-r--r-- 1 www-data www-data     9 feb 22 17:58 testMParicio3.txt
-rw-r--r-- 1 root     root         5 feb 21 10:13 testMParicio.txt
-rw-r--r-- 1 root     root     13578 feb 21 12:21 WebServer_cfg

All the new files (testMParicio2.txt, testMParicio3.txt) and the FilesSGITV folder are missing on iba2-ip192.

I don't know what to do. I'm lost.

Any help is appreciated, thanks.

(Edit)

My DRBD configuration, on both nodes:

cat /etc/drbd.conf 
# You can find an example in  /usr/share/doc/drbd.../drbd.conf.example

include "drbd.d/global_common.conf";
include "drbd.d/*.res";

My *.res configuration, also the same on both nodes:

resource iba {
        device /dev/drbd0;
        disk /dev/md3;
        meta-disk internal;
        on iba1 {
                address 10.0.0.248:7789;
        }
        on iba2 {
                address 10.0.0.249:7789;
        }
}

drbdadm uses iba1 and iba2, with IPs 10.0.0.248 and 10.0.0.249.

Corosync uses iba1-ip192 and iba2-ip192, with IPs 192.168.1.248 and 192.168.1.249.

cat /etc/hosts
127.0.0.1 localhost
#127.0.1.1 iba1
10.0.0.248  iba1
10.0.0.249  iba2
192.168.1.248 iba1-ip192
192.168.1.249 iba2-ip192
cat /etc/drbd.d/global_common.conf


global {
    usage-count yes;
    
    udev-always-use-vnr; # treat implicit the same as explicit volumes

}

common {
    handlers {
    }

    startup {
    }

    options {
    }

    disk {
    }

    net {
        protocol C;
    }
}

(Edit 2)

I found a problem in /proc/drbd.

On the primary node:

cat /proc/drbd 
version: 8.4.11 (api:1/proto:86-101)
srcversion: FC3433D849E3B88C1E7B55C 
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:2284 dr:11625 al:6 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:42364728

On the secondary node:

cat /proc/drbd 
version: 8.4.11 (api:1/proto:86-101)
srcversion: FC3433D849E3B88C1E7B55C 
 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:36538580
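
Both outputs above report cs:StandAlone, which means neither node is talking to its peer, and a nonzero oos value (out-of-sync data, in KiB). As a rough sketch, those two fields can be pulled out with a small shell function. The name drbd_proc_summary is made up for this example, and it assumes the single-device DRBD 8.4 /proc/drbd layout shown here.

```shell
# Hypothetical helper (not part of DRBD): print the connection state and the
# out-of-sync counter from a /proc/drbd-style file.
# Assumes exactly one DRBD device, as in the outputs above.
drbd_proc_summary() {
  # $1: file to read; defaults to the live /proc/drbd
  awk '
    {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^cs:/)  { cs  = substr($i, 4) }   # text after "cs:"
        if ($i ~ /^oos:/) { oos = substr($i, 5) }   # text after "oos:"
      }
    }
    END { printf "cs=%s oos=%s\n", cs, oos }
  ' "${1:-/proc/drbd}"
}

# Usage on a cluster node:
#   drbd_proc_summary
```

Anything other than cs=Connected (or a sync state such as SyncSource or SyncTarget) suggests that the UpToDate flag only describes the local disk.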

The secondary node no longer recognized the SSH key; I fixed that with:

ssh-keygen  -R 10.0.0.248
ssh-copy-id iba@iba1

But DRBD is still in the StandAlone state.
I don't know how to continue.

1 Answer

I found a split brain that did not show up in pcs status.

sudo journalctl | grep Split-Brain
feb 21 13:00:10 ibatec1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
feb 21 13:21:40 ibatec1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
feb 21 13:27:54 ibatec1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
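
Since pcs status stays silent about this, it may be worth checking the journal on each node. A minimal sketch, assuming the kernel message wording shown above; count_split_brain is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: read log lines on stdin and print how many of them
# report a DRBD split brain (case-insensitive match on the kernel message).
count_split_brain() {
  grep -ci 'split-brain detected'
}

# Usage on a cluster node (root may be required to read the full journal):
#   journalctl --no-pager | count_split_brain
```

A count above zero on either node is a hint that the resource dropped its connection and needs manual recovery.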

I stopped the cluster, used --force on the master, and then... on the split-brain victim (assuming the DRBD resource is iba):

drbdadm disconnect iba
drbdadm secondary iba
drbdadm connect --discard-my-data iba

On the split-brain survivor:

drbdadm primary iba
drbdadm connect iba
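
After the reconnect, the victim should resync from the survivor; while that runs, the connection state reads SyncSource or SyncTarget and returns to Connected when it is done. A hedged sketch for waiting on that before restarting the cluster: wait_connected and DRBD_WAIT_TRIES are names invented for this example, while drbdadm cstate is the existing subcommand that prints the connection state.

```shell
# Hypothetical helper: run the given command once per second until it prints
# "Connected", giving up after DRBD_WAIT_TRIES attempts (default 30).
# Returns 0 on success, 1 on timeout.
wait_connected() {
  tries=0
  while [ "$tries" -lt "${DRBD_WAIT_TRIES:-30}" ]; do
    state=$("$@" 2>/dev/null)
    case "$state" in
      Connected) return 0 ;;
    esac
    tries=$((tries + 1))
    sleep 1
  done
  return 1
}

# Usage on either node, assuming the resource is named iba:
#   wait_connected drbdadm cstate iba && echo "iba is connected"
```

For a large out-of-sync amount, the default 30 attempts will likely be too few, so DRBD_WAIT_TRIES would need to be raised accordingly.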
Answered 2022-02-23T09:15:46.143