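I'm trying to configure a MongoDB replica set, but it fails every time I try to add another member.

I have 3 members that I'm trying to configure. Their mongod.conf files all look like this: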

# mongo.conf

#where to log
logpath=/log/mongod.log

logappend=true

# fork and run in background
fork = true
smallfiles=true
rest=true
port = 27017
replSet=KidzpaceReplSet
dbpath=/data

The only difference is the port: 27017 (primary), 27018 (secondary), and 27019 (arbiter).

I have verified that the members can see each other:

[ec2-user@domU-12-31-39-06-C4-74 ~]$ mongo --host 174.129.232.170 --port 27018
MongoDB shell version: 2.4.3
connecting to: 174.129.232.170:27018/test
> 

[ec2-user@domU-12-31-39-0A-30-E8 ~]$ mongo --host 174.129.230.20 --port 27017
MongoDB shell version: 2.4.3
connecting to: 174.129.230.20:27017/test
> 

When I add the second member to the set, it returns OK:

KidzpaceReplSet:PRIMARY> rs.add("174.129.232.170:27018")
{ "ok" : 1 }

But whatever command I run next (in this case, adding my arbiter), the set fails with the following error:

KidzpaceReplSet:PRIMARY> rs.add("174.129.232.177:27019", true)
Tue May 28 20:24:07.139 DBClientCursor::init call() failed
Tue May 28 20:24:07.140 trying reconnect to 127.0.0.1:27017
Tue May 28 20:24:07.141 reconnect 127.0.0.1:27017 ok
reconnected to server after rs command (which is normal)

Here is the log file:

Tue May 28 20:44:06.173 [rsStart] replSet I am domU-12-31-39-06-C4-74:27017
Tue May 28 20:44:06.173 [rsStart] replSet STARTUP2
Tue May 28 20:44:07.175 [rsSync] replSet SECONDARY
Tue May 28 20:44:07.175 [rsMgr] replSet info electSelf 0
Tue May 28 20:44:08.174 [rsMgr] replSet PRIMARY
Tue May 28 20:44:29.813 [conn1] replSet replSetReconfig config object parses ok, 2 members specified
Tue May 28 20:44:29.817 [conn1] replSet replSetReconfig [2]
Tue May 28 20:44:29.817 [conn1] replSet info saving a newer config version to local.system.replset
Tue May 28 20:44:29.834 [conn1] replSet saveConfigLocally done
Tue May 28 20:44:29.834 [conn1] replSet info : additive change to configuration
Tue May 28 20:44:29.834 [conn1] replSet replSetReconfig new config saved locally
Tue May 28 20:44:39.835 [rsHealthPoll] DBClientCursor::init call() failed
Tue May 28 20:44:39.835 [rsHealthPoll] replset info 174.129.232.170:27018 heartbeat failed, retrying
Tue May 28 20:44:40.834 [rsHealthPoll] DBClientCursor::init call() failed
Tue May 28 20:44:40.834 [rsHealthPoll] replSet info 174.129.232.170:27018 is down (or slow to respond):
Tue May 28 20:44:40.835 [rsHealthPoll] replSet member 174.129.232.170:27018 is now in state DOWN
Tue May 28 20:44:40.835 [rsMgr] replSet total number of votes is even - add arbiter or give one member an extra vote
Tue May 28 20:44:40.835 [rsMgr] can't see a majority of the set, relinquishing primary
Tue May 28 20:44:40.835 [rsMgr] replSet relinquishing primary state
Tue May 28 20:44:40.835 [rsMgr] replSet SECONDARY
Tue May 28 20:44:40.835 [rsMgr] replSet closing client sockets after relinquishing primary
Tue May 28 20:44:42.044 [conn1] end connection 127.0.0.1:58727 (0 connections now open)
Tue May 28 20:44:46.150 [rsHealthPoll] replSet member 174.129.232.170:27018 is up
Tue May 28 20:44:46.151 [rsMgr] replSet not electing self, not all members up and we have been up less than 5 minutes
Tue May 28 20:44:52.156 [rsMgr] replSet not electing self, not all members up and we have been up less than 5 minutes

Update

I wonder whether the problem started when I ran rs.initiate(). It gave me this output:

{
    "set" : "KidzpaceReplSet",
    "date" : ISODate("2013-05-28T20:59:05Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "domU-12-31-39-06-C4-74:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 23,
            "optime" : {
                "t" : 1369774732,
                "i" : 1
            },
            "optimeDate" : ISODate("2013-05-28T20:58:52Z"),
            "self" : true
        }
    ],
    "ok" : 1
}

Notice the member's name: "name" : "domU-12-31-39-06-C4-74:27017". Where does this name come from? It is not my IP address. I'm not sure, but this may be the root of the problem.

1 Answer

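So it turns out that rs.initiate() may register the member that runs it under an internal alias instead of its IP address. In my case it was domU-12-31-39-06-C4-74, which matches the machine hostname shown in the shell prompt in the question.

The initial connection to the secondary works because the primary initiates it. However, the secondary then uses this alias when it tries to talk back to the primary, and that connection fails.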

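The solution is to copy the existing configuration: cfg = rs.conf()

Manually change the primary's name (host), substituting the primary's real address for the placeholder: cfg.members[0].host = "666.666.666.666:27017"

And reconfigure the replica set: rs.reconfig(cfg)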
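
The three steps above can be sketched as a small helper for the mongo shell. This is only an illustration under stated assumptions: withMemberHost and shallowCopy are hypothetical names, sampleCfg is a hand-written stand-in for what rs.conf() might return, and 174.129.230.20:27017 is assumed to be the primary's reachable address because that is the address the secondary connected to in the question.

```javascript
// Hypothetical helper: shallow-copy an object's own properties.
function shallowCopy(obj) {
  var out = {};
  for (var key in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
      out[key] = obj[key];
    }
  }
  return out;
}

// Hypothetical helper: return a copy of a replica set config in which the
// member at memberIndex has its host replaced. The input is not modified.
function withMemberHost(cfg, memberIndex, newHost) {
  if (!cfg || !Array.isArray(cfg.members)) {
    throw new Error("cfg.members must be an array");
  }
  if (memberIndex < 0 || memberIndex >= cfg.members.length) {
    throw new Error("no member at index " + memberIndex);
  }
  var result = shallowCopy(cfg);
  result.members = cfg.members.map(function (member, i) {
    var copy = shallowCopy(member);
    if (i === memberIndex) {
      copy.host = newHost;
    }
    return copy;
  });
  return result;
}

// Illustrative stand-in for rs.conf() output (assumed shape, not real output).
var sampleCfg = {
  _id: "KidzpaceReplSet",
  members: [
    { _id: 0, host: "domU-12-31-39-06-C4-74:27017" },
    { _id: 1, host: "174.129.232.170:27018" }
  ]
};

// Replace the alias with an address the other members can reach (assumed).
var fixedCfg = withMemberHost(sampleCfg, 0, "174.129.230.20:27017");
```

In the shell on the primary, the equivalent call would be rs.reconfig(withMemberHost(rs.conf(), 0, "174.129.230.20:27017")). Returning a copy rather than editing in place keeps the original config available for comparison if the reconfiguration needs to be checked afterwards.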

Answered 2013-05-29T19:00:13.917