我在 Windows 2008 R2 上运行带有 Erlang OTP 17.1 的 RabbitMQ v3.3.5。我的 Dev 和 QA 环境是独立的。我的登台和生产环境是集群的。
我发现这个问题经常发生在运行 RabbitMQ 服务的地方,RabbitMQ 管理控制台可以看到所有内容,但是当我尝试从命令行运行 rabbitmqctl 时,它失败并出现错误,提示节点已关闭(在本地和远程服务器)。
如果我重新启动 Windows 服务,此问题将得到解决。
我在 RabbitMQ 错误日志中没有看到任何错误消息。最后一条消息表明节点已启动。
以下是我最近在暂存 Windows 集群的节点 2 上遇到的问题的示例输出:
PS C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.3.5\sbin> .\rabbitmqctl.bat status
Status of node rabbit@MYSERVER2 ...
Error: unable to connect to node rabbit@MYSERVER2: nodedown
DIAGNOSTICS
===========
attempted to contact: [rabbit@MYSERVER2]
rabbit@MYSERVER2:
* connected to epmd (port 4369) on MYSERVER2
* epmd reports: node 'rabbit' not running at all
no other nodes on MYSERVER2
* suggestion: start the node
current node details:
- node name: rabbitmqctl2199771@MYSERVER2
- home dir: C:\Users\RabbitMQ
- cookie hash: mn6OaTX9mS4DnZaiOzg8pA==
此时我重新启动 RabbitMQ 服务,然后再试一次
PS C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.3.5\sbin> .\rabbitmqctl.bat status
Status of node rabbit@MYSERVER2...
[{pid,3784},
{running_applications,
[{rabbitmq_management_agent,"RabbitMQ Management Agent","3.3.5"},
{rabbit,"RabbitMQ","3.3.5"},
{os_mon,"CPO CXC 138 46","2.2.15"},
{mnesia,"MNESIA CXC 138 12","4.12.1"},
{xmerl,"XML parser","1.3.7"},
{sasl,"SASL CXC 138 11","2.4"},
{stdlib,"ERTS CXC 138 10","2.1"},
{kernel,"ERTS CXC 138 10","3.0.1"}]},
{os,{win32,nt}},
{erlang_version,
"Erlang/OTP 17 [erts-6.1] [64-bit] [smp:4:4] [async-threads:30]\n"},
{memory,
[{total,35960208},
{connection_procs,2704},
{queue_procs,5408},
{plugins,111936},
{other_proc,13695792},
{mnesia,102296},
{mgmt_db,0},
{msg_index,21816},
{other_ets,884704},
{binary,25776},
{code,16672826},
{atom,602729},
{other_system,3834221}]},
{alarms,[]},
{listeners,[{clustering,25672,"::"},{amqp,5672,"::"},{amqp,5672,"0.0.0.0"}]},
{vm_memory_high_watermark,0.4},
{vm_memory_limit,3435787059},
{disk_free_limit,50000000},
{disk_free,74911649792},
{file_descriptors,
[{total_limit,8092},
{total_used,4},
{sockets_limit,7280},
{sockets_used,2}]},
{processes,[{limit,1048576},{used,139}]},
{run_queue,0},
{uptime,5}]
...done.
关于导致这种情况的原因以及如何自动检测这种情况的任何想法?
这是在 Windows 上运行 RabbitMQ 的具体问题吗?