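We built an Akka cluster infrastructure for SMS, email, and push notifications. The system has three types of nodes: client, sender, and lighthouse. The client role is used by our Web application and our API application (both hosted in IIS). The lighthouse and sender roles are hosted as Windows services. We also run 4 more console applications, built from the same code as the Windows service, in the sender role.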

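For about 2 weeks, our web server has been running into port exhaustion. It starts consuming ports rapidly, and after a while we cannot perform any SQL operations. Sometimes we have no choice but to reset IIS. The problem occurs when more than one node is in the sender role. We diagnosed it and found the source of the problem.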

---------------
HOST                  OPEN    WAIT
SRV_NOTIFICATION      3429    0
SRV_LOCAL             198     0
SRV_UNDEFINED_IPV4    23      0
SRV_DATABASE          15      0
SRV_AUTH              4       0
SRV_API               6       0
SRV_UNDEFINED_IPV6    19      0
SRV_INBOUND           12347   5

TotalPortsInUse   : 17286
MaxUserPorts      : 64510
TcpTimedWaitDelay : 30
03/23/2017 09:30:10
---------------
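A minimal sketch of how a per-host summary like the one above could be ranked to see which peer holds the most open connections. The function and type names (`parse_port_table`, `HostPorts`) are hypothetical, and the parser assumes each data row has exactly three whitespace-separated fields, as in the table above.

```python
from typing import List, NamedTuple


class HostPorts(NamedTuple):
    host: str
    open_count: int
    wait_count: int


def parse_port_table(text: str) -> List[HostPorts]:
    """Parse 'HOST OPEN WAIT' rows; skip the header and any other lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row is: host name, open count, wait count.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            rows.append(HostPorts(parts[0], int(parts[1]), int(parts[2])))
    return rows


# Rows copied from the summary above.
PORT_TABLE = """\
HOST                  OPEN    WAIT
SRV_NOTIFICATION      3429    0
SRV_LOCAL             198     0
SRV_UNDEFINED_IPV4    23      0
SRV_DATABASE          15      0
SRV_AUTH              4       0
SRV_API               6       0
SRV_UNDEFINED_IPV6    19      0
SRV_INBOUND           12347   5
"""

if __name__ == "__main__":
    rows = parse_port_table(PORT_TABLE)
    total_open = sum(r.open_count for r in rows)
    # Print hosts from most to fewest open connections, with their share.
    for r in sorted(rows, key=lambda r: r.open_count, reverse=True):
        share = 100 * r.open_count / total_open
        print(f"{r.host:<20} {r.open_count:>6} {share:5.1f}%")
```

With these rows, SRV_INBOUND accounts for roughly three quarters of the listed open connections.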

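SRV_NOTIFICATION is the server where the lighthouse and sender nodes run. SRV_INBOUND is our web server. After reviewing this table, we checked which ports were allocated on the web server and got the results shown below. netstat listed more than 12,000 connections like these: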

TCP    192.168.1.10:65531     192.168.1.10:3564      ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65532     192.168.1.101:17527    ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65533     192.168.1.101:17527    ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65534     192.168.1.10:3564      ESTABLISHED     5716   [w3wp.exe]

192.168.1.10 is the web server, 192.168.1.10:3564 is the API, and 192.168.1.101:17527 is the lighthouse.
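With more than 12,000 such lines, grouping them by remote endpoint makes the pattern easier to see. The following is a rough sketch, assuming each connection is on a single line shaped like the listing above (protocol, local address, remote address, state, PID, process). The function name `count_by_remote` is hypothetical.

```python
from collections import Counter


def count_by_remote(netstat_output: str) -> Counter:
    """Count TCP connections per (remote endpoint, state) pair."""
    counts = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        # Assumed field order: proto, local, remote, state, pid, [process]
        if len(parts) >= 4 and parts[0] == "TCP":
            counts[(parts[2], parts[3])] += 1
    return counts


# The four lines quoted in the question.
NETSTAT_SAMPLE = """\
TCP    192.168.1.10:65531     192.168.1.10:3564      ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65532     192.168.1.101:17527    ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65533     192.168.1.101:17527    ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65534     192.168.1.10:3564      ESTABLISHED     5716   [w3wp.exe]
"""

if __name__ == "__main__":
    # Most frequent remote endpoints first.
    for (remote, state), n in count_by_remote(NETSTAT_SAMPLE).most_common():
        print(f"{remote:<24} {state:<12} {n}")
```

Run against the full netstat capture, a count that keeps growing for the same two remote endpoints would be consistent with connections being opened to the API and lighthouse ports and never released.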

Connections are being opened but never closed.

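After each deployment, our Web and API applications leave and rejoin the cluster; they are configured with fixed ports. We monitor our cluster with the application created by @cgstevens. Even though we implemented graceful shutdown logic for the actor system, the Web and API applications sometimes fail to leave the cluster, so we have to remove the nodes manually and restart the actor system.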

We have reproduced the problem in our development environment and recorded a video:

https://drive.google.com/file/d/0B5ZNfLACId3jMWUyOWliMUhNWTQ/view

The HOCON configurations for our nodes are as follows:

Web and API

<akka>
    <hocon><![CDATA[
            akka{
                loglevel = DEBUG

                actor{
                    provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"

                    deployment {
                        /coordinatorRouter {
                            router = round-robin-group
                            routees.paths = ["/user/NotificationCoordinator"]
                            cluster {
                                    enabled = on
                                    max-nr-of-instances-per-node = 1
                                    allow-local-routees = off
                                    use-role = sender
                            }
                        }

                        /decidingRouter {
                            router = round-robin-group
                            routees.paths = ["/user/NotificationDeciding"]
                            cluster {
                                    enabled = on
                                    max-nr-of-instances-per-node = 1
                                    allow-local-routees = off
                                    use-role = sender
                            }
                        }
                    }

                    serializers {
                            wire = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
                    }

                    serialization-bindings {
                     "System.Object" = wire
                    }

                    debug{
                        receive = on
                        autoreceive = on
                        lifecycle = on
                        event-stream = on
                        unhandled = on
                    }
                }

                remote {
                    helios.tcp {
                            transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote"
                            applied-adapters = []
                            transport-protocol = tcp
                            hostname = "192.168.1.10"
                            port = 3564
                    }
                }

                cluster {
                        seed-nodes = ["akka.tcp://notificationSystem@192.168.1.101:17527"]
                        roles = [client]
                }
            }
        ]]>
    </hocon>
</akka>

Lighthouse

<akka>
    <hocon>
        <![CDATA[
                lighthouse{
                        actorsystem: "notificationSystem"
                    }

                akka {
                    actor { 
                        provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"

                        serializers {
                            wire = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
                        }

                        serialization-bindings {
                            "System.Object" = wire
                        }
                    }

                    remote {
                        log-remote-lifecycle-events = DEBUG
                        helios.tcp {
                            transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote"
                            applied-adapters = []
                            transport-protocol = tcp
                            #will be populated with a dynamic host-name at runtime if left uncommented
                            #public-hostname = "192.168.1.100"
                            hostname = "192.168.1.101"
                            port = 17527
                        }
                    }            

                    loggers = ["Akka.Logger.NLog.NLogLogger,Akka.Logger.NLog"]

                    cluster {
                        seed-nodes = ["akka.tcp://notificationSystem@192.168.1.101:17527"]
                        roles = [lighthouse]
                    }
                }
        ]]>
    </hocon>
</akka>

Sender

<akka>
    <hocon><![CDATA[
                akka{
                    # stdout-loglevel = DEBUG
                    loglevel = DEBUG
                    # log-config-on-start = on

                    loggers = ["Akka.Logger.NLog.NLogLogger, Akka.Logger.NLog"]

                    actor{
                        debug {  
                            # receive = on 
                            # autoreceive = on
                            # lifecycle = on
                            # event-stream = on
                            # unhandled = on
                        }         

                        provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"           

                        serializers {
                            wire = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
                        }

                        serialization-bindings {
                         "System.Object" = wire
                        }

                        deployment{                         
                            /NotificationCoordinator/LoggingCoordinator/DatabaseActor{
                                router = round-robin-pool
                                resizer{
                                    enabled = on
                                    lower-bound = 3
                                    upper-bound = 5
                                }
                            }                           

                            /NotificationDeciding/NotificationDecidingWorkerActor{
                                router = round-robin-pool
                                resizer{
                                    enabled = on
                                    lower-bound = 3
                                    upper-bound = 5
                                }
                            }

                            /ScheduledNotificationCoordinator/SendToProMaster/JobToProWorker{
                                router = round-robin-pool
                                resizer{
                                    enabled = on
                                    lower-bound = 3
                                    upper-bound = 5
                                }
                            }
                        }
                    }

                 remote{                            
                            log-remote-lifecycle-events = DEBUG
                            log-received-messages = on

                            helios.tcp{
                                transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote"
                                applied-adapters = []
                                transport-protocol = tcp
                                #will be populated with a dynamic host-name at runtime if left uncommented
                                #public-hostname = "POPULATE STATIC IP HERE"
                                hostname = "192.168.1.101"
                                port = 0
                        }
                    }

                    cluster {
                        seed-nodes = ["akka.tcp://notificationSystem@192.168.1.101:17527"]
                        roles = [sender]
                    }
                }
            ]]></hocon>
</akka>

Cluster Monitor

<akka>
    <hocon>
        <![CDATA[
                akka {
                    stdout-loglevel = INFO
                    loglevel = INFO
                    log-config-on-start = off 

                    actor {
                        provider = "Akka.Remote.RemoteActorRefProvider, Akka.Remote"                

                        serializers {
                            wire = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
                        }
                        serialization-bindings {
                            "System.Object" = wire
                        }

                        deployment {                                
                            /clustermanager {
                                dispatcher = akka.actor.synchronized-dispatcher
                            }
                        }
                    }

                    remote {
                        log-remote-lifecycle-events = INFO
                        log-received-messages = off
                        log-sent-messages = off

                        helios.tcp {                                
                            transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote"
                            applied-adapters = []
                            transport-protocol = tcp
                            #will be populated with a dynamic host-name at runtime if left uncommented
                            #public-hostname = "127.0.0.1"
                            hostname = "192.168.1.101"
                            port = 0
                        }
                    }            

                    cluster {                           
                        seed-nodes = ["akka.tcp://notificationSystem@192.168.1.101:17527"]
                        roles = [ClusterManager]

                        client {
                            initial-contacts = ["akka.tcp://notificationSystem@192.168.1.101:17527/system/receptionist"]
                        }
                    }
                }
        ]]>
    </hocon>
</akka>

1 Answer

This is a confirmed bug that will probably be fixed by the CoordinatedShutdown feature in Akka.Net V1.2:

https://github.com/akkadotnet/akka.net/issues/2575

Until 1.2 is released, you can use the latest nightly build:

http://getakka.net/docs/akka-developers/nightly-builds

Edit: Akka.Net V1.2 has been released, but this bug was postponed to V1.3.

https://github.com/akkadotnet/akka.net/milestone/14

Answered 2017-03-31T06:21:55.327