
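I decided to try upgrading my current cluster from ES 2.1.1 to ES 2.2.0. It is a pair of mirrored nodes. The cluster runs in AWS, so I use the cloud-aws plugin for communication.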

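I upgraded the first node successfully and it became the master, but while upgrading the second node I ran into a strange communication/authentication problem.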

I followed the guide here, but I still seem to be hitting a strange problem.

From the main cluster log on the second node:

[2016-02-03 12:29:41,241][INFO ][discovery.ec2            ] [Sharon Ventura] failed to send join request to master [{Space Phantom}{NzN7b7ZHT8uPu6oXJAORMg}{10.60.164.147}{10.60.164.147:9300}], reason [RemoteTransportException[[Space Phantom][10.60.164.147:9300][internal:discovery/zen/join]]; nested: IllegalStateException[failure when sending a validation request to node]; nested: RemoteTransportException[[Sharon Ventura][10.60.163.74:9300][internal:discovery/zen/join/validate]]; nested: ElasticsearchSecurityException[missing authentication token for action [internal:discovery/zen/join/validate]]; ]
[2016-02-03 12:29:42,455][DEBUG][action.admin.cluster.health] [Sharon Ventura] no known master node, scheduling a retry
[2016-02-03 12:29:44,255][INFO ][discovery.ec2            ] [Sharon Ventura] failed to send join request to master [{Space Phantom}{NzN7b7ZHT8uPu6oXJAORMg}{10.60.164.147}{10.60.164.147:9300}], reason [RemoteTransportException[[Space Phantom][10.60.164.147:9300][internal:discovery/zen/join]]; nested: IllegalStateException[failure when sending a validation request to node]; nested: RemoteTransportException[[Sharon Ventura][10.60.163.74:9300][internal:discovery/zen/join/validate]]; nested: ElasticsearchSecurityException[missing authentication token for action [internal:discovery/zen/join/validate]]; ]
[2016-02-03 12:29:47,269][INFO ][discovery.ec2            ] [Sharon Ventura] failed to send join request to master [{Space Phantom}{NzN7b7ZHT8uPu6oXJAORMg}{10.60.164.147}{10.60.164.147:9300}], reason [RemoteTransportException[[Space Phantom][10.60.164.147:9300][internal:discovery/zen/join]]; nested: IllegalStateException[failure when sending a validation request to node]; nested: RemoteTransportException[[Sharon Ventura][10.60.163.74:9300][internal:discovery/zen/join/validate]]; nested: ElasticsearchSecurityException[missing authentication token for action [internal:discovery/zen/join/validate]]; ]
[2016-02-03 12:29:49,472][DEBUG][action.admin.cluster.state] [Sharon Ventura] timed out while retrying [cluster:monitor/state] after failure (timeout [30s])
[2016-02-03 12:29:49,473][INFO ][rest.suppressed          ] /_cluster/settings Params: {}
MasterNotDiscoveredException[null]
        at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$5.onTimeout(TransportMasterNodeAction.java:205)
        at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:239)
        at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:794)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
[2016-02-03 12:29:50,283][INFO ][discovery.ec2            ] [Sharon Ventura] failed to send join request to master [{Space Phantom}{NzN7b7ZHT8uPu6oXJAORMg}{10.60.164.147}{10.60.164.147:9300}], reason [RemoteTransportException[[Space Phantom][10.60.164.147:9300][internal:discovery/zen/join]]; nested: IllegalStateException[failure when sending a validation request to node]; nested: RemoteTransportException[[Sharon Ventura][10.60.163.74:9300][internal:discovery/zen/join/validate]]; nested: ElasticsearchSecurityException[missing authentication token for action [internal:discovery/zen/join/validate]]; ]
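
The join failures above repeat the same nested exception chain, and the last "nested:" entry is the one that names the actual rejection. A minimal sketch for pulling that innermost exception out of a log, assuming the line layout shown above (the function name root_cause is hypothetical, as is the log path in the usage line):

```shell
# Print the distinct innermost exceptions from join-failure lines read on stdin.
# Assumes each failure line ends with "; ]" as in the log excerpt above.
root_cause() {
  grep 'failed to send join request' |
    sed -n 's/.*nested: \(.*\); ]$/\1/p' |
    sort -u
}

# Usage (assumed log location):
#   root_cause < /var/log/elasticsearch/cluster01.log
```

For the excerpt above, this collapses the repeated failures into a single ElasticsearchSecurityException for the action internal:discovery/zen/join/validate, which the log shows being raised on Sharon Ventura, the joining node.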

My elasticsearch.yml file:

cluster.name: cluster01
http.cors.enabled: true
network.host: 0.0.0.0
discovery.type: ec2
discovery.ec2.tag.project_code_info: "cluster01"
cloud.aws.region: eu-central-1

I can see in the log that it detects the first node, [Space Phantom][10.60.164.147:9300]. It detects it without any intervention, but it evidently cannot authenticate.

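I suspect this may be related to the Shield plugin, which is also installed, but the permissions are set up correctly and exactly as they were before. Nothing else has changed.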

I use a username and password in Shield, with no SSL configured.

Can anyone help?


1 Answer


As requested by @user3458016: I managed to figure it out.

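I resolved this by resetting all settings and configuration on every node: removing the license and shield plugins, deleting all users, and re-adding the users exactly as before. The configurations were identical to begin with, so this is odd.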

First, stop elasticsearch on all nodes. If it runs locally, stop kibana.

If you have any custom roles in /etc/elasticsearch/shield/roles.yml, review that file and, where possible, refresh it from a single documented configuration.

Remove the plugins:

/usr/share/elasticsearch/bin/plugin remove elasticsearch/license/latest
/usr/share/elasticsearch/bin/plugin remove elasticsearch/shield/latest

Remove the users:

/usr/share/elasticsearch/bin/shield/esusers userdel admin
/usr/share/elasticsearch/bin/shield/esusers userdel logstash

Re-add the plugins:

/usr/share/elasticsearch/bin/plugin install elasticsearch/license/latest -b
/usr/share/elasticsearch/bin/plugin install elasticsearch/shield/latest -b

Re-add the users:

/usr/share/elasticsearch/bin/shield/esusers useradd admin -p adminuserpw -r admin
/usr/share/elasticsearch/bin/shield/esusers useradd logstash -p logstashuserpw -r logstash

If you have any custom roles, double-check /etc/elasticsearch/shield/roles.yml to verify that the configuration has not been modified or overwritten.

Start elasticsearch on the first node. If it runs locally, start kibana.

Check that the indices appear correctly and verify the master node status.
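
One way to make this check concrete is through the _cat APIs. The sketch below assumes the node answers HTTP on localhost:9200 and reuses the admin credentials from the example commands; both are assumptions about your setup, and red_indices and check_first_node are illustrative helper names, not anything shipped with Elasticsearch.

```shell
# Print the names of indices whose health is red.
# Expects "health index" pairs on stdin, one per line.
red_indices() {
  awk '$1 == "red" { print $2 }'
}

# Show the elected master, then list any red indices.
# $1: base URL, $2: user:password for HTTP basic auth (both assumptions).
check_first_node() {
  curl -s -u "$2" "$1/_cat/master?v"
  curl -s -u "$2" "$1/_cat/indices?h=health,index" | red_indices
}

# Usage (assumed URL and credentials):
#   check_first_node http://localhost:9200 admin:adminuserpw
```

With only one node running, indices that have replicas will normally report yellow rather than green, so this sketch only flags red ones.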

Carry out all of the steps above on all the other nodes.

Start elasticsearch on the remaining nodes, one at a time. Verify healthy cluster replication before starting the next node.
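
The health check between node starts could be scripted roughly as follows. This is a sketch, not a tested procedure: it assumes the cluster health API is reachable at the base URL you pass in and that the Shield user you supply is allowed to call it, and the helper names are made up for illustration.

```shell
# Extract the "status" value from a _cluster/health JSON response on stdin.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Succeed only when the health response on stdin reports green.
cluster_is_green() {
  [ "$(health_status)" = "green" ]
}

# Block for up to 60s waiting for green, then report success or failure.
# $1: base URL, $2: user:password for HTTP basic auth (both assumptions).
wait_for_green() {
  curl -s -u "$2" "$1/_cluster/health?wait_for_status=green&timeout=60s" |
    cluster_is_green
}

# Usage (assumed URL and credentials):
#   wait_for_green http://localhost:9200 admin:adminuserpw && echo "start the next node"
```

The wait_for_status and timeout parameters make the request itself wait, so the script does not need its own polling loop. If the timeout expires first, the response carries the current non-green status and the function returns failure.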

I hope someone finds this useful.

Answered 2016-02-29T16:16:22.017