I am currently running Docker Swarm on 3 nodes. First, I created a network with

docker network create -d overlay xxx_net

and after that a service with

docker service create --network xxx_net --replicas 1 -p 12345:12345 --name nameofservice nameofimage:1

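If I'm not mistaken, this uses the routing mesh (which is fine for me). But I can reach the service only on the IP of the node where the container is running, even though it should be available on every node's IP.

If I drain a node, the container starts on a different node, and then the service is available on that new node's IP.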
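
One way to pin down this symptom is to probe the published port on every node's IP. The sketch below assumes a netcat (`nc`) that supports `-z` and `-w`; `probe_nodes` and the `workerN-ip` host names are illustrative placeholders, not anything Docker provides.

```shell
# probe_nodes PORT HOST...
# Prints, for each host, whether a TCP connection to PORT succeeds.
# Assumption: nc supports -z (connect only) and -w (timeout in seconds).
probe_nodes() {
    port=$1
    shift
    for host in "$@"; do
        if nc -z -w 2 "$host" "$port" 2>/dev/null; then
            echo "$host:$port reachable"
        else
            echo "$host:$port unreachable"
        fi
    done
}

# Example (placeholder host names):
#   probe_nodes 12345 worker1-ip worker2-ip worker3-ip
```

With a working routing mesh, every node should be reported as reachable, regardless of where the task is running.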


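**More information added here:

  • I restarted all servers: 3 workers, one of which is also the manager
  • After booting, everything seemed to work fine!
  • I am using the rabbitmq image from Docker Hub. The Dockerfile is quite small: FROM rabbitmq:3-management
  • The container was started on worker2
  • I can connect to the RabbitMQ management page from all workers: worker1-ip:15672, worker2-ip:15672, worker3-ip:15672, so I think all the needed ports are open.
  • About 1 hour later, the rabbitmq container had been moved from worker2 to worker3. I don't know why.
  • After that, I could no longer connect via worker1-ip:15672 or worker2-ip:15672, but worker3-ip:15672 still worked!
  • I drained worker3: docker node update --availability drain worker3
  • The container started on worker1.
  • After that I could connect only via worker1-ip:15672, no longer via worker2 or worker3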

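One more test: the Docker services were restarted on all workers, and things work again?! Let's wait a few hours...

Today's status: 2 of the 3 nodes are working fine. From the service log on the manager: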

Jul 12 07:53:32 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:53:32.787953754Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:53:39 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:53:39.787783458Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:55:27 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:55:27.790564790Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:55:41 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:55:41.787974530Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:56:33 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:56:33.027525926Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
Jul 12 07:56:33 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:56:33.027668473Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
Jul 12 08:13:22 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:13:22.787796692Z" level=info msg="memberlist: Marking dockerswarmworker2-03ec8453a81f as failed, suspect timeout reached"
Jul 12 08:21:37 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:21:37.788694522Z" level=info msg="memberlist: Marking dockerswarmworker2-03ec8453a81f as failed, suspect timeout reached"
Jul 12 08:24:01 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:24:01.525570127Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
Jul 12 08:24:01 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:24:01.525713893Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
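
To see which swarm member the manager keeps declaring dead, the `memberlist` lines can be tallied per member. This is a minimal sketch, assuming the daemon log is read on standard input; `count_failed_marks` is a hypothetical helper name, and the `journalctl -u docker` invocation in the comment assumes a systemd host whose unit is called `docker`.

```shell
# count_failed_marks
# Reads dockerd log lines on stdin and prints "<member> <count>" for every
# member that memberlist marked as failed, sorted by member name.
count_failed_marks() {
    sed -n 's/.*memberlist: Marking \([^ ]*\) as failed.*/\1/p' |
        sort |
        uniq -c |
        awk '{print $2, $1}'
}

# Example (assumes a systemd unit named "docker"):
#   journalctl -u docker --since today | count_failed_marks
```

Run against the manager excerpt above, it would report four failures for dockerswarmworker2-459b4229d652 and two for dockerswarmworker2-03ec8453a81f.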

And from the worker's Docker log:

Jul 12 08:20:47 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:20:47.486202716Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
Jul 12 08:21:38 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:38.288117026Z" level=warning msg="memberlist: Refuting a dead message (from: h999-99-999-185.scenegroup.fi-891b24339f8a)"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404554761Z" level=warning msg="Neighbor entry already present for IP 10.255.0.3, mac 02:42:0a:ff:00:03"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404588738Z" level=warning msg="Neighbor entry already present for IP 104.198.180.163, mac 02:42:0a:ff:00:03"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404609273Z" level=warning msg="Neighbor entry already present for IP 10.255.0.6, mac 02:42:0a:ff:00:06"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404622776Z" level=warning msg="Neighbor entry already present for IP 104.198.180.163, mac 02:42:0a:ff:00:06"
Jul 12 08:21:47 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:47.486007317Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
Jul 12 08:22:47 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:22:47.485821037Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
Jul 12 08:23:17 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:23:17.630602898Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"

And this is from the worker:

Jul 12 08:33:09 h999-99-999-185.scenegroup.fi dockerd[10330]: time="2017-07-12T08:33:09.219973777Z" level=warning msg="Neighbor entry already present for IP 10.0.0.3, mac xxxxx"
Jul 12 08:33:09 h999-99-999-185.scenegroup.fi dockerd[10330]: time="2017-07-12T08:33:09.220539013Z" level=warning msg="Neighbor entry already present for IP "managers ip here", mac xxxxxx"

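I restarted Docker on the problematic worker and it started working again. I will follow...

** Today's results:

  • 2 workers available, 1 not
  • I did not do anything
  • After 4 hours of leaving the swarm alone, everything seems to be working again?!
  • Services have been moved from one worker to another for no good reason; all of this looks like a communication problem.
  • Quite confusing.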

1 Answer

Upgrade to Docker 17.06.

The ingress overlay network was broken for a long time, until about 17.06-rc3.

Answered 2017-09-22T16:45:29.277
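
A quick way to tell whether a node still runs an engine older than 17.06 is to compare its reported version against that release. The following is a sketch, not an official check: `version_at_least` and `check_nodes` are hypothetical helper names, and it treats any 17.06 build (release candidates included) as new enough.

```shell
# version_at_least VERSION MAJOR MINOR
# Succeeds if VERSION (e.g. "17.06.0-ce") is at least MAJOR.MINOR.
version_at_least() {
    v=$1
    major=${v%%.*}           # text before the first dot
    rest=${v#*.}             # text after the first dot
    minor=${rest%%[!0-9]*}   # leading digits of the second component
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# check_nodes
# Reads "node version" pairs on stdin and flags engines older than 17.06.
check_nodes() {
    while read -r node version; do
        if version_at_least "$version" 17 6; then
            echo "$node $version ok"
        else
            echo "$node $version upgrade"
        fi
    done
}

# Example, run on each node (prints the local engine version first):
#   echo "$(hostname) $(docker version --format '{{.Server.Version}}')" | check_nodes
```

Every node in the swarm should report "ok" before the routing mesh is tested again.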