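I have been setting up a Mesos cluster of 3 nodes (A, B, C), with the Mesos master, Mesos slave, and ZooKeeper processes each running in a Docker container on every node.
Since the whole cluster setup, including docker run, is executed with Ansible, there should be no difference among the 3 nodes other than node-specific configuration (hostname, zookeeper_myid, etc.).
The problems are...
ZooKeeper warning on node A
ZooKeeper shows the following messages only on node A.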
2015-05-25 03:28:06,060 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /<ip-nodeA>:58391
2015-05-25 03:28:06,060 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /<ip-nodeA>:58391; will be dropped if server is in r-o mode
2015-05-25 03:28:06,060 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@841] - Refusing session request for client /<ip-nodeA>:58391 as it has seen zxid 0x44 our last zxid is 0xc client must try another server
2015-05-25 03:28:06,060 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /<ip-nodeA>:58391 (no session established for client)
ZooKeeper on node B shows the following messages.
2015-05-25 03:12:18,594 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /<ip-nodeB>:42784 which had sessionid 0x14d89037c1e0000
2015-05-25 03:12:30,000 [myid:] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x14d89037c1e0000, timeout of 10000ms exceeded
2015-05-25 03:12:30,001 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x14d89037c1e0000
2015-05-25 03:12:30,987 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /<ip-nodeB>:42853
2015-05-25 03:12:30,987 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /<ip-nodeB>:42853; will be dropped if server is in r-o mode
2015-05-25 03:12:30,988 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /<ip-nodeB>:42853
2015-05-25 03:12:30,997 [myid:] - INFO [SyncThread:0:ZooKeeperServer@617] - Established session 0x14d89037c1e0002 with negotiated timeout 10000 for client /<ip-nodeB>:42853
ZooKeeper on node C shows the following messages.
2015-05-25 03:12:31,183 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /<ip-nodeA>:56496
2015-05-25 03:12:31,184 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /<ip-nodeA>:56496; will be dropped if server is in r-o mode
2015-05-25 03:12:31,184 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /<ip-nodeA>:56496
2015-05-25 03:12:31,191 [myid:] - INFO [SyncThread:0:ZooKeeperServer@617] - Established session 0x14d89037ccd0002 with negotiated timeout 10000 for client /<ip-nodeA>:56496
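To compare the three ZooKeeper instances directly, a small check like the following might help. It is only a sketch, assuming the `srvr` four-letter command is reachable on port 2181 of each node; the function names are hypothetical helpers, not part of any ZooKeeper client library. If every node reports `Mode: standalone` instead of `leader` or `follower`, the servers are probably not forming a single ensemble, which might explain why node A sees a zxid (0x44) that its local server (0xc) does not know about.

```python
import socket


def query_zookeeper(host, port=2181, command=b"srvr", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_srvr(reply):
    """Extract the 'Mode' and 'Zxid' fields from a srvr reply."""
    fields = {}
    for line in reply.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("Mode", "Zxid"):
            fields[key.strip()] = value.strip()
    return fields


def check_nodes(hosts):
    """Print mode and last zxid for each host (host names are placeholders)."""
    for host in hosts:
        try:
            print(host, parse_srvr(query_zookeeper(host)))
        except OSError as exc:
            print(host, "unreachable:", exc)
```

Calling `check_nodes(["<ip-nodeA>", "<ip-nodeB>", "<ip-nodeC>"])` with the real addresses would print one line per node; differing modes or zxids across nodes would be the thing to look for.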
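"No master is currently leading ..." on node B
Node C was elected as master. Accessing the Mesos admin page on node A successfully redirects to node C.
But accessing it on node B does not redirect to node C; it shows "No master is currently leading ..." instead.
Master detects only 2 of the 3 slaves
On the master (currently node C), only 2 of the 3 slaves are detected. The 2 detected slaves are nodes A and C.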
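The set of slaves the master knows about can also be read from the master's state endpoint rather than the web UI. The following is a minimal sketch, assuming the state document is served as JSON with a top-level `slaves` list whose entries carry a `hostname` field; the URL used in the usage note (`/master/state.json` on port 5050) is an assumption that may differ between Mesos versions, and both function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_state(url, timeout=5.0):
    """Download and decode a Mesos master state document (URL is an assumption)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def registered_slaves(state):
    """Return the sorted hostnames of the slaves listed in a state document."""
    return sorted(s.get("hostname", "<unknown>") for s in state.get("slaves", []))
```

For example, `registered_slaves(fetch_state("http://<ip-nodeC>:5050/master/state.json"))` would list the slaves registered with node C, which makes it easy to confirm that node B's slave is the missing one.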
So, what are the possible causes of these problems?
OS: CentOS 6.5
Docker images:
- Mesos Master: redjack/mesos-master
- Mesos Slave: redjack/mesos-slave
- ZooKeeper: digitalwonderland/zookeeper
Docker version:
Client version: 1.5.0
Client API version: 1.17
Go version (client): go1.3.3
Git commit (client): a8a31ef/1.5.0
OS/Arch (client): linux/amd64
Server version: 1.5.0
Server API version: 1.17
Go version (server): go1.3.3
Git commit (server): a8a31ef/1.5.0