I have a three-node Hadoop cluster up and running. For some reason, when the datanode slaves start up, they register with an IP address that doesn't even exist on my network. Here is my hostname-to-IP mapping.
nodes:
- hostname: hadoop-master
ip: 192.168.51.4
- hostname: hadoop-data1
ip: 192.168.52.4
- hostname: hadoop-data2
ip: 192.168.52.6
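Since the point of the mapping above is that 192.168.51.1 is not assigned to any of these hosts, this is roughly how the addresses bound on each node can be double-checked (just a sketch, output not pasted here):
hadoop@hadoop-data2:~$ ip -4 addr show    # list every IPv4 address bound on this node
hadoop@hadoop-data2:~$ hostname -I        # quick view of the addresses this host answers to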
As you can see below, the hadoop-master node comes up fine, but of the other two nodes only one ever shows up as a live datanode, and whichever one it is always reports the IP 192.168.51.1, which, as shown in the mapping above, doesn't even exist on my network.
hadoop@hadoop-master:~$ hdfs dfsadmin -report
Safe mode is ON
Configured Capacity: 84482326528 (78.68 GB)
Present Capacity: 75735965696 (70.53 GB)
DFS Remaining: 75735281664 (70.53 GB)
DFS Used: 684032 (668 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (2):
Name: 192.168.51.1:50010 (192.168.51.1)
Hostname: hadoop-data2
Decommission Status : Normal
Configured Capacity: 42241163264 (39.34 GB)
DFS Used: 303104 (296 KB)
Non DFS Used: 4305530880 (4.01 GB)
DFS Remaining: 37935329280 (35.33 GB)
DFS Used%: 0.00%
DFS Remaining%: 89.81%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Fri Sep 25 13:54:23 UTC 2015
Name: 192.168.51.4:50010 (hadoop-master)
Hostname: hadoop-master
Decommission Status : Normal
Configured Capacity: 42241163264 (39.34 GB)
DFS Used: 380928 (372 KB)
Non DFS Used: 4440829952 (4.14 GB)
DFS Remaining: 37799952384 (35.20 GB)
DFS Used%: 0.00%
DFS Remaining%: 89.49%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Fri Sep 25 13:54:21 UTC 2015
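As far as I understand, the NameNode identifies a datanode by the source address of its registration connection, so one way to see which address a slave actually uses to reach the master is a routing lookup like the following (a sketch run on hadoop-data2; output omitted):
hadoop@hadoop-data2:~$ ip route get 192.168.51.4
# the "src" field in the output is the local address that outgoing traffic to
# the master carries; a mismatch with 192.168.52.6 would be one explanation
# for the 192.168.51.1 entry in the report above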
I did try explicitly adding dfs.datanode.address on each host, but in that case the node doesn't even show up as live. Here is what my hdfs-site.xml looks like (note that I have tried it both with and without dfs.datanode.address set; a sketch of the per-host variant follows the file below).
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Default block replication.
      The actual number of replications can be specified when the file is created.
      The default is used if replication is not specified at create time.
    </description>
  </property>
  <property>
    <name>dfs.namenode.rpc-bind-host</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>dfs.datanode.address</name>
    <value>192.168.51.4:50010</value>
  </property>
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/hadoop-data/hdfs/namenode</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table (fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/hadoop-data/hdfs/datanode</value>
    <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
  </property>
</configuration>
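To be clear about the per-host variant mentioned above: on each slave the dfs.datanode.address property was set to that node's own address. Reconstructed for hadoop-data1 (a sketch, not the exact file), it looked roughly like this:
<!-- hdfs-site.xml on hadoop-data1 (sketch) -->
<property>
  <name>dfs.datanode.address</name>
  <value>192.168.52.4:50010</value>
</property>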
Why is Hadoop associating each datanode with an IP that doesn't even exist? Or, more importantly, how can I get the nodes running correctly?
Update: the /etc/hosts file is identical on all nodes:
192.168.51.4 hadoop-master
192.168.52.4 hadoop-data1
192.168.52.6 hadoop-data2
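To rule out name resolution problems, these entries can be checked from the master with something like the following (sketch; output not shown):
hadoop@hadoop-master:~$ getent hosts hadoop-data1 hadoop-data2   # should print the 192.168.52.x entries above
hadoop@hadoop-master:~$ ping -c 1 hadoop-data2                   # confirms the name resolves and the host is reachable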
Here are the contents of my slaves file.
hadoop@hadoop-master:~$ cat /usr/local/hadoop/etc/hadoop/slaves
hadoop-master
hadoop-data1
hadoop-data2
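For completeness, this is roughly how HDFS gets restarted and checked after each configuration change (a sketch, assuming the Hadoop sbin scripts are on PATH):
hadoop@hadoop-master:~$ stop-dfs.sh
hadoop@hadoop-master:~$ start-dfs.sh
hadoop@hadoop-master:~$ jps    # should list NameNode (and a DataNode, since the master is also in the slaves file)
hadoop@hadoop-data1:~$ jps     # should list a DataNode on each slave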
Datanode logs:
https://gist.github.com/dwatrous/7241bb804a9be8f9303f
https://gist.github.com/dwatrous/bcd85cda23d6eca3a68b
https://gist.github.com/dwatrous/922c4f773aded0137fa3
Namenode logs:
https://gist.github.com/dwatrous/dafaa7695698f36a5d93