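I have the same problem as @Amir Soleimani, but the error output is slightly different. I tried every solution in that thread and none of them worked. I am using Azure Kubernetes Service (AKS), and after upgrading from 1.13.xx to 1.18.xx I can no longer start RabbitMQ.
Update: the solution that worked for me (please weigh this approach carefully, because it may affect your existing queues)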
Remove current rabbitmq StatefulSet including persistent disks
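A minimal sketch of that removal step, assuming the resource names from the manifest in this post (StatefulSet rabbitmq, claim template volume, 3 replicas). The helper names pvc_names, run and cleanup_plan and the DRY_RUN switch are my own additions. PVCs created from volumeClaimTemplates are named <template>-<statefulset>-<ordinal>, and whether the underlying disk is also deleted depends on the storage class reclaim policy.

```shell
#!/bin/sh
# Sketch: remove the StatefulSet and its per-pod PersistentVolumeClaims.
# DRY_RUN defaults to 1, so the commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the PVC names for a claim template: <template>-<statefulset>-<ordinal>.
pvc_names() {
  template="$1"; sts="$2"; replicas="$3"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s-%s-%s\n' "$template" "$sts" "$i"
    i=$((i + 1))
  done
}

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Delete the StatefulSet first, then every claim it created.
cleanup_plan() {
  run kubectl delete statefulset rabbitmq
  for pvc in $(pvc_names volume rabbitmq 3); do
    run kubectl delete pvc "$pvc"
  done
}

cleanup_plan
```

Running it with DRY_RUN=0 executes the deletions for real, which discards the queue data stored on those volumes.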
========
Here is my StatefulSet file:
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-management
  labels:
    app: rabbitmq
spec:
  ports:
  - port: 80
    targetPort: 15672
    name: http
  selector:
    app: rabbitmq
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq
  labels:
    app: rabbitmq
spec:
  ports:
  - port: 5672
    name: amqp
  - port: 4369
    name: epmd
  - port: 25672
    name: rabbitmq-dist
  clusterIP: None
  selector:
    app: rabbitmq
---
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-config
  namespace: default
type: Opaque
data:
  erlang.cookie: samplecookie==
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
  labels:
    app: rabbitmq
spec:
  serviceName: rabbitmq
  selector:
    matchLabels:
      app: rabbitmq
  replicas: 3
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      containers:
      - name: rabbitmq
        image: 'rabbitmq:3.6.6-management-alpine'
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - -c
              - >
                if [ -z "$(grep rabbitmq /etc/resolv.conf)" ]; then
                  sed "s/^search \([^ ]\+\)/search rabbitmq.\1 \1/" /etc/resolv.conf > /etc/resolv.conf.new;
                  cat /etc/resolv.conf.new > /etc/resolv.conf;
                  rm /etc/resolv.conf.new;
                fi;
                until rabbitmqctl node_health_check; do sleep 1; done;
                if [[ "$HOSTNAME" != "rabbitmq-0" && -z "$(rabbitmqctl cluster_status | grep rabbitmq-0)" ]]; then
                  rabbitmqctl stop_app;
                  rabbitmqctl join_cluster rabbit@rabbitmq-0;
                  rabbitmqctl start_app;
                fi;
                rabbitmqctl set_policy ha-all "." '{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}'
        env:
        - name: RABBITMQ_ERLANG_COOKIE
          valueFrom:
            secretKeyRef:
              name: rabbitmq-config
              key: erlang.cookie
        - name: RABBITMQ_DEFAULT_USER
          value: username
        - name: RABBITMQ_DEFAULT_PASS
          value: password
        ports:
        - containerPort: 5672
          name: amqp
        - containerPort: 15672
          name: amqp-management
        volumeMounts:
        - mountPath: /var/lib/rabbitmq
          name: volume
  volumeClaimTemplates:
  - metadata:
      name: volume
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
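For anyone reading the postStart hook in the manifest: its first step rewrites the search line of /etc/resolv.conf so that short peer names such as rabbitmq-1 resolve through the headless rabbitmq Service. A small sketch of what that sed expression does, run against a sample search line that I made up (the real file contents depend on the cluster and namespace); rewrite_search is a hypothetical wrapper name. The \+ in the pattern relies on GNU or BusyBox sed.

```shell
#!/bin/sh
# Sample search line; an assumption, real /etc/resolv.conf contents vary.
sample='search default.svc.cluster.local svc.cluster.local cluster.local'

# Same sed expression as in the postStart hook: prepend
# "rabbitmq.<first search domain>" to the existing search list.
rewrite_search() {
  sed "s/^search \([^ ]\+\)/search rabbitmq.\1 \1/"
}

printf '%s\n' "$sample" | rewrite_search
```

With GNU sed the search list should now start with rabbitmq.default.svc.cluster.local, followed by the original entries unchanged.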
Output of kubectl describe pod rabbitmq-0:
DIAGNOSTICS
===========
attempted to contact: ['rabbit@rabbitmq-0']
rabbit@rabbitmq-0:
* connected to epmd (port 4369) on rabbitmq-0
* epmd reports: node 'rabbit' not running at all
no other nodes on rabbitmq-0
* suggestion: start the node
current node details:
- node name: 'rabbitmq-cli-91@rabbitmq-0'
- home dir: /var/lib/rabbitmq
- cookie hash: P1XNOe5pN3Ug2FCRFzH7Xg==
Error: unable to connect to node 'rabbit@rabbitmq-0': nodedown
DIAGNOSTICS
===========
attempted to contact: ['rabbit@rabbitmq-0']
rabbit@rabbitmq-0:
* connected to epmd (port 4369) on rabbitmq-0
* epmd reports: node 'rabbit' not running at all
no other nodes on rabbitmq-0
* suggestion: start the node
current node details:
- node name: 'rabbitmq-cli-26@rabbitmq-0'
- home dir: /var/lib/rabbitmq
- cookie hash: P1XNOe5pN3Ug2FCRFzH7Xg==
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}
Error: rabbit application is not running on node rabbit@rabbitmq-0.
* Suggestion: start it with "rabbitmqctl start_app" and try again
, message: "Timeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nTimeout: 70.0 seconds ...\nChecking health of node 'rabbit@rabbitmq-0' ...\nError: unable to connect to node 'rabbit@rabbitmq-0': nodedown\n\nDIAGNOSTICS\n===========\n\nattempted to contact: ['rabbit@rabbitmq-0']\n\nrabbit@rabbitmq-0:\n * connected to epmd (port 4369) on rabbitmq-0\n * epmd reports: node 'rabbit' not running at all\n no other nodes on rabbitmq-0\n * suggestion: start the node\n\ncurrent node details:\n- node name: 'rabbitmq-cli-91@rabbitmq-0'\n- home dir: /var/lib/rabbitmq\n- cookie hash: P1XNOe5pN3Ug2FCRFzH7Xg==\n\nError: unable to connect to node 'rabbit@rabbitmq-0': nodedown\n\nDIAGNOSTICS\n===========\n\nattempted to contact: ['rabbit@rabbitmq-0']\n\nrabbit@rabbitmq-0:\n * connected to epmd (port 4369) on rabbitmq-0\n * epmd reports: node 'rabbit' not running at all\n no other nodes on rabbitmq-0\n * suggestion: start the node\n\ncurrent node details:\n- node name: 'rabbitmq-cli-26@rabbitmq-0'\n- home dir: /var/lib/rabbitmq\n- cookie hash: P1XNOe5pN3Ug2FCRFzH7Xg==\n\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: 
{aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: {aborted,{no_exists,[rabbit_vhost,[{{vhost,'$1','_'},[],['$1']}]]}}\nError: rabbit application is not running on node rabbit@rabbitmq-0.\n * Suggestion: start it with \"rabbitmqctl start_app\" and try again\n"
Warning FailedPostStartHook 23m kubelet Exec lifecycle hook ([/bin/sh -c if [ -z "$(grep rabbitmq /etc/resolv.conf)" ]; then
sed "s/^search \([^ ]\+\)/search rabbitmq.\1 \1/" /etc/resolv.conf > /etc/resolv.conf.new;
cat /etc/resolv.conf.new > /etc/resolv.conf;
rm /etc/resolv.conf.new;
fi; until rabbitmqctl node_health_check; do sleep 1; done; if [[ "$HOSTNAME" != "rabbitmq-0" && -z "$(rabbitmqctl cluster_status | grep rabbitmq-0)" ]]; then
rabbitmqctl stop_app;
rabbitmqctl join_cluster rabbit@rabbitmq-0;
rabbitmqctl start_app;
fi; rabbitmqctl set_policy ha-all "." '{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}'
]) for Container "rabbitmq" in Pod "rabbitmq-0_default(3ac91d73-de7b-4cde-81f6-c31bacd10252)" failed - error: command '/bin/sh -c if [ -z "$(grep rabbitmq /etc/resolv.conf)" ]; then
sed "s/^search \([^ ]\+\)/search rabbitmq.\1 \1/" /etc/resolv.conf > /etc/resolv.conf.new;
cat /etc/resolv.conf.new > /etc/resolv.conf;
rm /etc/resolv.conf.new;
fi; until rabbitmqctl node_health_check; do sleep 1; done; if [[ "$HOSTNAME" != "rabbitmq-0" && -z "$(rabbitmqctl cluster_status | grep rabbitmq-0)" ]]; then
rabbitmqctl stop_app;
rabbitmqctl join_cluster rabbit@rabbitmq-0;
rabbitmqctl start_app;
fi; rabbitmqctl set_policy ha-all "." '{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}'
' exited with 137: Error: unable to connect to node 'rabbit@rabbitmq-0': nodedown
Output of kubectl logs rabbitmq-0:
=CRASH REPORT==== 18-Jul-2021::11:06:01 ===
crasher:
initial call: application_master:init/4
pid: <0.156.0>
registered_name: []
exception exit: {{timeout_waiting_for_tables,
[rabbit_user,rabbit_user_permission,rabbit_vhost,
rabbit_durable_route,rabbit_durable_exchange,
rabbit_runtime_parameters,rabbit_durable_queue]},
{rabbit,start,[normal,[]]}}
in function application_master:init/4 (application_master.erl, line 134)
ancestors: [<0.155.0>]
messages: [{'EXIT',<0.157.0>,normal}]
links: [<0.155.0>,<0.31.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 987
stack_size: 27
reductions: 98
neighbours:
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: rabbit
exited: {{timeout_waiting_for_tables,
[rabbit_user,rabbit_user_permission,rabbit_vhost,
rabbit_durable_route,rabbit_durable_exchange,
rabbit_runtime_parameters,rabbit_durable_queue]},
{rabbit,start,[normal,[]]}}
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: amqp_client
exited: stopped
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: rabbit_common
exited: stopped
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: xmerl
exited: stopped
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: os_mon
exited: stopped
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: inets
exited: stopped
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: asn1
exited: stopped
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: syntax_tools
exited: stopped
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: mnesia
exited: stopped
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: crypto
exited: stopped
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: ranch
exited: stopped
type: temporary
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
application: compiler
exited: stopped
type: temporary
BOOT FAILED
===========
Timeout contacting cluster nodes: ['rabbit@rabbitmq-1','rabbit@rabbitmq-2'].
BACKGROUND
==========
This cluster node was shut down while other nodes were still running.
To avoid losing data, you should start the other nodes first, then
start this one. To force this node to start, first invoke
"rabbitmqctl force_boot". If you do so, any changes made on other
cluster nodes after this one was shut down may be lost.
DIAGNOSTICS
===========
attempted to contact: ['rabbit@rabbitmq-1','rabbit@rabbitmq-2']
rabbit@rabbitmq-1:
* unable to connect to epmd (port 4369) on rabbitmq-1: nxdomain (non-existing domain)
rabbit@rabbitmq-2:
* unable to connect to epmd (port 4369) on rabbitmq-2: nxdomain (non-existing domain)
current node details:
- node name: 'rabbit@rabbitmq-0'
- home dir: /var/lib/rabbitmq
- cookie hash: P1XNOe5pN3Ug2FCRFzH7Xg==
=INFO REPORT==== 18-Jul-2021::11:06:01 ===
Timeout contacting cluster nodes: ['rabbit@rabbitmq-1','rabbit@rabbitmq-2'].
BACKGROUND
==========
This cluster node was shut down while other nodes were still running.
To avoid losing data, you should start the other nodes first, then
start this one. To force this node to start, first invoke
"rabbitmqctl force_boot". If you do so, any changes made on other
cluster nodes after this one was shut down may be lost.
DIAGNOSTICS
===========
attempted to contact: ['rabbit@rabbitmq-1','rabbit@rabbitmq-2']
rabbit@rabbitmq-1:
* unable to connect to epmd (port 4369) on rabbitmq-1: nxdomain (non-existing domain)
rabbit@rabbitmq-2:
* unable to connect to epmd (port 4369) on rabbitmq-2: nxdomain (non-existing domain)
current node details:
- node name: 'rabbit@rabbitmq-0'
- home dir: /var/lib/rabbitmq
- cookie hash: P1XNOe5pN3Ug2FCRFzH7Xg==
{"init terminating in do_boot",timeout_waiting_for_tables}
init terminating in do_boot (timeout_waiting_for_tables)
Crash dump is being written to: erl_crash.dump...
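The BOOT FAILED section of this log names the peers that rabbitmq-0 is waiting for. A rough sketch for pulling those names out of the log so that each one can be checked with kubectl; the function name unreachable_peers is my own, and the sample line is copied from the log above. It assumes a grep that supports -o and -m (GNU, BSD and BusyBox do).

```shell
#!/bin/sh
# Print the host part of every node listed on the first
# "Timeout contacting cluster nodes" line read from stdin.
unreachable_peers() {
  grep -m1 'Timeout contacting cluster nodes' |
    grep -o 'rabbit@[A-Za-z0-9.-]*' |
    sed 's/^rabbit@//'
}

# Sample line copied from the log in this post.
sample_line="Timeout contacting cluster nodes: ['rabbit@rabbitmq-1','rabbit@rabbitmq-2']."

printf '%s\n' "$sample_line" | unreachable_peers

# Possible use against the live pod (not executed here):
#   kubectl logs rabbitmq-0 | unreachable_peers | while read -r peer; do
#     kubectl get pod "$peer"
#   done
```

If kubectl get pod reports that those pods do not exist, that would match the nxdomain lines in the DIAGNOSTICS output.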
What I tried, without success:
rabbitmqctl stop_app
rabbitmqctl force_boot
Remove StatefulSet and re-install
Re-configure the yaml file