When I scaled ElastiCache in AWS from 3 shards to 12, my service started throwing HTTP 500 errors and dropping connections to clients. Checking the logs, I see the following error:

https://paste-bin.xyz/14386 (the full stack trace is too large to post)

Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedConstructorAccessor8.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at io.netty.channel.ReflectiveChannelFactory.newChannel(ReflectiveChannelFactory.java:44)
    ... 46 more
Caused by: io.netty.channel.ChannelException: Failed to open a socket.
    at io.netty.channel.socket.nio.NioSocketChannel.newSocket(NioSocketChannel.java:71)
    at io.netty.channel.socket.nio.NioSocketChannel.<init>(NioSocketChannel.java:88)
    at io.netty.channel.socket.nio.NioSocketChannel.<init>(NioSocketChannel.java:81)
    ... 50 more
Caused by: java.net.SocketException: Too many open files
    at sun.nio.ch.Net.socket0(Native Method)
    at sun.nio.ch.Net.socket(Net.java:439)
    at sun.nio.ch.Net.socket(Net.java:432)
    at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:103)
    at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
    at io.netty.channel.socket.nio.NioSocketChannel.newSocket(NioSocketChannel.java:69)
    ... 52 more

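From the trimmed log above, it looks like there are too many connection requests for the client to keep up with. But I'm not sure I'm right about this.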

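I reduced the number of shards from 12 to 7 and no longer see the error above in the logs. However, with 3 shards there are more cache misses. The shard configuration is the same: 1 master node and 3 worker nodes. My machine can handle up to 65535 file descriptors, which I thought would be more than enough for 12 shards. Any pointers to what is going on are much appreciated!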
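For reference, here is a minimal sketch of how the file descriptor numbers could be checked from inside the service itself. The 65535 figure is the machine's limit, and the limit that applies to the JVM process (for example via `ulimit -n`) may be lower, so it seems worth printing what the JVM actually sees. The class name `FdCheck` and the `connectionsPerNode` value are hypothetical: the real per-node connection count depends on the Redis client and its pool settings, which are not shown here. The node count assumes 12 shards with 1 master and 3 worker nodes each.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdCheck {

    // Rough upper bound on the sockets one service instance may open to the cluster.
    // connectionsPerNode is an assumed pool size, not a value taken from any real config.
    static long estimatedSockets(int shards, int nodesPerShard, int connectionsPerNode) {
        return (long) shards * nodesPerShard * connectionsPerNode;
    }

    // Returns {open, max} file descriptor counts for this JVM process,
    // or null when the JVM does not expose the Unix-specific MXBean.
    static long[] fdUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        // 12 shards, 4 nodes per shard (1 master + 3 workers), 10 connections per node (assumed).
        System.out.println("Estimated sockets: " + estimatedSockets(12, 4, 10));

        long[] usage = fdUsage();
        if (usage == null) {
            System.out.println("File descriptor counts are not available on this JVM.");
        } else {
            System.out.println("Open file descriptors: " + usage[0]);
            System.out.println("Process limit:         " + usage[1]);
        }
    }
}
```

If the process limit printed here is far below 65535, or the open count keeps climbing after the scale-out, that would point to either a lower per-process limit or connections that are opened but never released, rather than to the shard count alone.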
