I recently deployed hanfs using DRBD as the backend; in my case I'm running active/standby, but I've also successfully tested it in primary/primary mode with OCFS2. Unfortunately, there isn't much documentation on how best to accomplish this, and most of what exists is barely useful at best. If you do go down the drbd route, I highly recommend joining the drbd mailing list and reading all of the documentation. Here's my ha/drbd setup, along with the script I wrote to handle ha's failures:
DRBD8 is required - this is provided by drbd8-utils and drbd8-source. Once those are installed (I believe they're provided by backports), you can use module-assistant to install it - ma i drbd8. Either depmod -a or reboot at this point; if you depmod -a, you'll need to modprobe drbd.
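On Debian/Ubuntu, the install steps above look roughly like this (a sketch; package names as given in the text, and the module-assistant invocation may vary slightly by version):

```shell
apt-get install drbd8-utils drbd8-source
m-a a-i drbd8    # module-assistant build+install ("ma i drbd8" above)
depmod -a        # or just reboot instead
modprobe drbd    # needed if you went the depmod -a route
```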
You'll need a backend partition to use for drbd. Do not make this partition LVM, or you'll hit all sorts of problems. Do not put LVM on the drbd device either, or you'll hit all sorts of problems.
Hanfs1's /etc/drbd.conf:
global {
    usage-count no;
}
common {
    protocol C;
    disk { on-io-error detach; }
}
resource export {
    syncer {
        rate 125M;
    }
    on hanfs2 {
        address 172.20.1.218:7789;
        device /dev/drbd1;
        disk /dev/sda3;
        meta-disk internal;
    }
    on hanfs1 {
        address 172.20.1.219:7789;
        device /dev/drbd1;
        disk /dev/sda3;
        meta-disk internal;
    }
}
Hanfs2's /etc/drbd.conf:
global {
    usage-count no;
}
common {
    protocol C;
    disk { on-io-error detach; }
}
resource export {
    syncer {
        rate 125M;
    }
    on hanfs2 {
        address 172.20.1.218:7789;
        device /dev/drbd1;
        disk /dev/sda3;
        meta-disk internal;
    }
    on hanfs1 {
        address 172.20.1.219:7789;
        device /dev/drbd1;
        disk /dev/sda3;
        meta-disk internal;
    }
}
Once you've got your configuration in place, we next need to bring drbd up.
drbdadm create-md export
drbdadm attach export
drbdadm connect export
We now have to perform the initial sync of the data - obviously, if this is a brand-new drbd cluster, it doesn't matter which node you pick.
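With DRBD 8.x, the initial sync is kicked off from whichever node you want to start out as primary, along these lines (per the DRBD user's guide; hanfs1 as the chosen node is just an example):

```shell
# On the node that should become primary (e.g. hanfs1):
drbdadm -- --overwrite-data-of-peer primary export
# Watch the sync progress:
cat /proc/drbd
```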
Once done, you'll need to mkfs.yourchoiceofilesystem on your drbd device - the device from our config above is /dev/drbd1. http://www.drbd.org/users-guide/p-work.html is a useful document to read while working with drbd.
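For example, assuming ext3 (pick whatever filesystem you prefer), on the current primary node only:

```shell
mkfs.ext3 /dev/drbd1
mkdir -p /export            # the mountpoint must exist on both nodes
mount /dev/drbd1 /export
```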
Heartbeat
Install heartbeat2 (easy enough: apt-get install heartbeat2).
/etc/ha.d/ha.cf on each machine should include:
hanfs1:
logfacility local0
keepalive 2
warntime 10
deadtime 30
initdead 120
ucast eth1 172.20.1.218
auto_failback no
node hanfs1
node hanfs2
hanfs2:
logfacility local0
keepalive 2
warntime 10
deadtime 30
initdead 120
ucast eth1 172.20.1.219
auto_failback no
node hanfs1
node hanfs2
/etc/ha.d/haresources should be identical on both ha boxes:
hanfs1 IPaddr::172.20.1.230/24/eth1
hanfs1 HeartBeatWrapper
I wrote a wrapper script to deal with the quirks that nfs and drbd cause in a failover scenario. This script should live in /etc/ha.d/resource.d/ on each machine.
#!/bin/bash
# heartbeat fails hard.
# So this is a wrapper to get around that stupidity.
# I'm just wrapping the heartbeat scripts, except for in the case of umount,
# as they work, mostly.
if [[ -e /tmp/heartbeatwrapper ]]; then
runningpid=$(cat /tmp/heartbeatwrapper)
if [[ -z $(ps --no-heading -p $runningpid) ]]; then
echo "PID found, but process seems dead. Continuing."
else
echo "PID found, process is alive, exiting."
exit 7
fi
fi
echo $$ > /tmp/heartbeatwrapper
if [[ x$1 == "xstop" ]]; then
/etc/init.d/nfs-kernel-server stop #>/dev/null 2>&1
# NFS init script isn't LSB compatible, exit codes are 0 no matter what happens.
# Thanks guys, you really make my day with this bullshit.
# Because of the above, we just have to hope that nfs actually catches the signal
# to exit, and manages to shut down its connections.
# If it doesn't, we'll kill it later, then term any other nfs stuff afterwards.
# I found this to be an interesting insight into just how badly NFS is written.
sleep 1
#we don't want to shutdown nfs first!
#The lock files might go away, which would be bad.
#The above seems to not matter much, the only thing I've determined
#is that if you have anything mounted synchronously, it's going to break
#no matter what I do. Basically, sync == screwed; in NFSv3 terms.
#End result of failing over while a client that's synchronous is that
#the client hangs waiting for its nfs server to come back - thing doesn't
#even bother to time out, or attempt a reconnect.
#async works as expected - it insta-reconnects as soon as a connection seems
#to be unstable, and continues to write data. In all tests, md5sums have
#remained the same with/without failover during transfer.
#So, we first unmount /export - this prevents drbd from having a shit-fit
#when we attempt to turn this node secondary.
#That's a lie too, to some degree. LVM is entirely to blame for why DRBD
#was refusing to unmount. Don't get me wrong, having /export mounted doesn't
#help either, but still.
#fix a usecase where one or other are unmounted already, which causes us to terminate early.
if [[ "$(grep -o /var/lib/nfs/rpc_pipefs /etc/mtab)" ]]; then
for ((test=1; test <= 10; test++)); do
umount /var/lib/nfs/rpc_pipefs >/dev/null 2>&1
if [[ -z $(grep -o /var/lib/nfs/rpc_pipefs /etc/mtab) ]]; then
break
fi
if [[ $? -ne 0 ]]; then
#try again, harder this time
umount -l /var/lib/nfs/rpc_pipefs >/dev/null 2>&1
if [[ -z $(grep -o /var/lib/nfs/rpc_pipefs /etc/mtab) ]]; then
break
fi
fi
done
#the loop counter ends at 11 when all retries are exhausted, so test for > 10
if [[ $test -gt 10 ]]; then
rm -f /tmp/heartbeatwrapper
echo "Problem unmounting rpc_pipefs"
exit 1
fi
fi
if [[ "$(grep -o /dev/drbd1 /etc/mtab)" ]]; then
for ((test=1; test <= 10; test++)); do
umount /export >/dev/null 2>&1
if [[ -z $(grep -o /dev/drbd1 /etc/mtab) ]]; then
break
fi
if [[ $? -ne 0 ]]; then
#try again, harder this time
umount -l /export >/dev/null 2>&1
if [[ -z $(grep -o /dev/drbd1 /etc/mtab) ]]; then
break
fi
fi
done
if [[ $test -gt 10 ]]; then
rm -f /tmp/heartbeatwrapper
echo "Problem unmounting /export"
exit 1
fi
fi
#now, it's important that we shut down nfs. it can't write to /export anymore, so that's fine.
#if we leave it running at this point, then drbd will screwup when trying to go to secondary.
#See contradictory comment above for why this doesn't matter anymore. These comments are left in
#entirely to remind me of the pain this caused me to resolve. A bit like why churches have Jesus
#nailed onto a cross instead of chilling in a hammock.
pidof nfsd | xargs kill -9 >/dev/null 2>&1
sleep 1
if [[ -n $(ps aux | grep nfs | grep -v grep) ]]; then
echo "nfs still running, trying to kill again"
pidof nfsd | xargs kill -9 >/dev/null 2>&1
fi
sleep 1
/etc/init.d/nfs-kernel-server stop #>/dev/null 2>&1
sleep 1
#next we need to tear down drbd - easy with the heartbeat scripts
#it takes input as resourcename start|stop|status
#First, we'll check to see if it's stopped
/etc/ha.d/resource.d/drbddisk export status >/dev/null 2>&1
if [[ $? -eq 2 ]]; then
echo "resource is already stopped for some reason..."
else
for ((i=1; i <= 10; i++)); do
/etc/ha.d/resource.d/drbddisk export stop >/dev/null 2>&1
if [[ $(egrep -o "st:[A-Za-z/]*" /proc/drbd | cut -d: -f2) == "Secondary/Secondary" ]] || [[ $(egrep -o "st:[A-Za-z/]*" /proc/drbd | cut -d: -f2) == "Secondary/Unknown" ]]; then
echo "Successfully stopped DRBD"
break
else
echo "Failed to stop drbd for some reason"
cat /proc/drbd
if [[ $i -eq 10 ]]; then
exit 50
fi
fi
done
fi
rm -f /tmp/heartbeatwrapper
exit 0
elif [[ x$1 == "xstart" ]]; then
#start up drbd first
/etc/ha.d/resource.d/drbddisk export start >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "Something seems to have broken. Let's check possibilities..."
testvar=$(egrep -o "st:[A-Za-z/]*" /proc/drbd | cut -d: -f2)
if [[ $testvar == "Primary/Unknown" ]] || [[ $testvar == "Primary/Secondary" ]]
then
echo "All is fine, we are already the Primary for some reason"
elif [[ $testvar == "Secondary/Unknown" ]] || [[ $testvar == "Secondary/Secondary" ]]
then
echo "Trying to assume Primary again"
/etc/ha.d/resource.d/drbddisk export start >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "I give up, something's seriously broken here, and I can't help you to fix it."
rm -f /tmp/heartbeatwrapper
exit 127
fi
fi
fi
sleep 1
#now we remount our partitions
for ((test=1; test <= 10; test++)); do
mount /dev/drbd1 /export >/tmp/mountoutput
if [[ -n $(grep -o export /etc/mtab) ]]; then
break
fi
done
if [[ $test -gt 10 ]]; then
rm -f /tmp/heartbeatwrapper
exit 125
fi
#I'm really unsure at this point of the side-effects of not having rpc_pipefs mounted.
#The issue here, is that it cannot be mounted without nfs running, and we don't really want to start
#nfs up at this point, lest it ruin everything.
#For now, I'm leaving mine unmounted, it doesn't seem to cause any problems.
#Now we start up nfs.
/etc/init.d/nfs-kernel-server start >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "There's not really that much that I can do to debug nfs issues."
echo "probably your configuration is broken. I'm terminating here."
rm -f /tmp/heartbeatwrapper
exit 129
fi
#And that's it, done.
rm -f /tmp/heartbeatwrapper
exit 0
elif [[ "x$1" == "xstatus" ]]; then
#Lets check to make sure nothing is broken.
#DRBD first
/etc/ha.d/resource.d/drbddisk export status >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "stopped"
rm -f /tmp/heartbeatwrapper
exit 3
fi
#mounted?
grep -q drbd /etc/mtab >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "stopped"
rm -f /tmp/heartbeatwrapper
exit 3
fi
#nfs running?
/etc/init.d/nfs-kernel-server status >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "stopped"
rm -f /tmp/heartbeatwrapper
exit 3
fi
echo "running"
rm -f /tmp/heartbeatwrapper
exit 0
fi
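One detail the listing doesn't show: the script has to be executable, and its filename must match the haresources entry (HeartBeatWrapper above). Assuming heartbeat's standard resource directory:

```shell
chmod +x /etc/ha.d/resource.d/HeartBeatWrapper
# sanity-check it by hand before letting heartbeat drive it:
/etc/ha.d/resource.d/HeartBeatWrapper status
```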
Having done all of the above, you then just need to configure /etc/exports:
/export 172.20.1.0/255.255.255.0(rw,sync,fsid=1,no_root_squash)
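After editing /etc/exports, the export table can be reloaded without a full nfs restart:

```shell
exportfs -ra    # re-export everything in /etc/exports
exportfs -v     # verify what's currently exported
```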
Then it's just a matter of starting heartbeat on both machines and issuing hb_takeover on one of them. You can test that it's working by making sure the node you issued the takeover on is primary - check /proc/drbd, that the device is mounted correctly, and that you can reach nfs.
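A quick verification pass on the node you failed over to might look like this (hb_takeover's install path varies by packaging; /usr/share/heartbeat/ is one common location):

```shell
/usr/share/heartbeat/hb_takeover    # on the node that should become primary
cat /proc/drbd                      # expect st:Primary/Secondary on this node
grep /export /etc/mtab              # /dev/drbd1 should be mounted on /export
showmount -e 172.20.1.230           # from a client: the export should be listed
```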
--
Best of luck. Setting this up from scratch was an extremely painful experience for me.