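We have two nodes running the same code, joined in a cluster with Akka.NET, and they exchange messages over remoting.
The Akka.NET version is 1.2.0 and we use dot-netty for the transport. Here is the relevant part of the configuration: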
actor {
provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
}
remote {
dot-netty.tcp {
port = 34083
hostname = host_name
}
}
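For readers reproducing this setup: the fragment above presumably sits inside the top-level `akka` block. A minimal sketch of the full nesting follows. The explicit `retry-gate-closed-for` line is an assumption added for illustration and is not shown in the original configuration; 5 s is the Akka.Remote default, which is consistent with the "gated for 5000 ms" text in the log messages.

```hocon
akka {
  actor {
    provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
  }
  remote {
    # Assumption: not part of the original config; written out here only
    # to show where the 5000 ms gate reported in the logs comes from.
    retry-gate-closed-for = 5 s

    dot-netty.tcp {
      port = 34083
      hostname = host_name   # placeholder, as in the original post
    }
  }
}
```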
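The two nodes run on different Windows servers (each hosted in a Windows service). Occasionally a node stops listening on its assigned port (checked with netstat -an), and all communication between the nodes is lost until I restart the Windows service.
My guess is that the transport layer fails, and dot-netty closes the socket and stops listening.
Is there any way to prevent this from happening, or at least make it less frequent? If not, can we hook into the failure event to start listening again?
This is all we get in the logs (the first two messages are from one host, the third is from the other):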
60133 2017-08-11 10:09:11.993 Host1 Akka.Remote.Transport.ProtocolStateActor Error No response from remote. Handshake timed out or transport failure detector triggered.
60134 2017-08-11 10:09:12.040 Host1 Akka.Remote.ReliableDeliverySupervisor Warn Association with remote system akka.tcp://ProcesamientoActorSystem@warpacb004.nead.danet:34083 has failed; address is now gated for 5000 ms. Reason is: [Akka.Remote.EndpointDisassociatedException: Disassociated
at Akka.Remote.EndpointWriter.PublishAndThrow(Exception reason, LogLevel level, Boolean needToThrow)
at Akka.Actor.ReceiveActor.ExecutePartialMessageHandler(Object message, PartialAction`1 partialAction)
at Akka.Actor.UntypedActor.Receive(Object message)
at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
at Akka.Actor.ActorCell.ReceiveMessage(Object message)
at Akka.Actor.ActorCell.AutoReceiveMessage(Envelope envelope)
at Akka.Actor.ActorCell.Invoke(Envelope envelope)
--- End of stack trace from previous location where exception was thrown ---
at Akka.Actor.ActorCell.HandleFailed(Failed f)
at Akka.Actor.ActorCell.SysMsgInvokeAll(EarliestFirstSystemMessageList messages, Int32 currentState)]
1 partialAction)
at Akka.Actor.ActorCell.<>c__DisplayClass112_0.b__0(Object m)
at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
at Akka.Actor.ActorCell.ReceiveMessage(Object message)
at Akka.Actor.ActorCell.AutoReceiveMessage(Envelope envelope)
at Akka.Actor.ActorCell.Invoke(Envelope envelope)]
60135 2017-08-11 10:09:14.313 Host2 Akka.Remote.ReliableDeliverySupervisor Warn Association with remote system akka.tcp://ProcesamientoActorSystem@warpacb005.nead.danet:34083 has failed; address is now gated for 5000 ms. Reason is: [Akka.Remote.EndpointDisassociatedException: Disassociated
at Akka.Remote.EndpointWriter.PublishAndThrow(Exception reason, LogLevel level, Boolean needToThrow)
at Akka.Actor.ReceiveActor.ExecutePartialMessageHandler(Object message, PartialAction