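I think I must be misunderstanding something about DataNodes in a Hadoop cluster. I have a virtual Hadoop cluster made up of master, slave1, slave2, and slave3. master and slave1 run on one physical machine; slave2 and slave3 run on another. When I start the cluster, the HDFS web UI shows only three live DataNodes: slave1, master, and slave2. But sometimes the three live DataNodes are master, slave1, and slave3, which is strange. I SSHed into the node whose DataNode had not started. Running jps there shows no DataNode process, yet I can still copy and delete files on HDFS from that node. So I clearly do not understand DataNodes correctly. I have three questions:
1. Is there one DataNode per node?
2. Why can a node that is not running a DataNode still read and write on HDFS?
3. Can we decide the number of DataNodes?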
Here is the log from the DataNode that did not start:
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = slave11/192.168.111.31
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.3
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192; compiled by 'hortonfo' on Tue May 8 20:31:25 UTC 2012
************************************************************/
2012-08-03 17:47:07,578 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-08-03 17:47:07,595 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-08-03 17:47:07,596 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-08-03 17:47:07,596 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-08-03 17:47:07,911 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-08-03 17:47:07,915 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-08-03 17:47:09,457 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.111.21:54310. Already tried 0 time(s).
2012-08-03 17:47:10,460 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.111.21:54310. Already tried 1 time(s).
2012-08-03 17:47:11,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.111.21:54310. Already tried 2 time(s).
2012-08-03 17:47:19,565 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2012-08-03 17:47:19,601 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
2012-08-03 17:47:19,620 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2012-08-03 17:47:24,721 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2012-08-03 17:47:24,854 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2012-08-03 17:47:24,952 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = false
2012-08-03 17:47:24,953 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
2012-08-03 17:47:24,953 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
2012-08-03 17:47:24,953 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2012-08-03 17:47:24,953 INFO org.mortbay.log: jetty-6.1.26
2012-08-03 17:47:25,665 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
2012-08-03 17:47:25,688 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2012-08-03 17:47:25,690 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source DataNode registered.
2012-08-03 17:47:30,717 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2012-08-03 17:47:30,718 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort50020 registered.
2012-08-03 17:47:30,718 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort50020 registered.
2012-08-03 17:47:30,721 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(slave11:50010, storageID=DS-1062340636-127.0.0.1-50010-1339803955209, infoPort=50075, [...]
[...] IPC Server listener on 50020: starting
2012-08-03 17:47:30,773 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
2012-08-03 17:47:30,773 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
2012-08-03 17:47:30,795 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.
2012-08-03 17:47:30,816 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Finished asynchronous block report scan in 52ms
2012-08-03 17:47:30,838 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Generated rough (lockless) block report in 32 ms
2012-08-03 17:47:30,840 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reconciled asynchronous block report against current state in 2 ms
2012-08-03 17:47:31,158 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-6072482390929551157_78209
2012-08-03 17:47:33,775 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reconciled asynchronous block report against current state in 1 ms
2012-08-03 17:47:33,793 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is shutting down: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node 192.168.111.31:50010 is attempting to report storage ID DS-1062340636-127.0.0.1-50010-1339803955209. Node 192.168.111.32:50010 is expected to serve this storage.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDatanode(FSNamesystem.java:4608)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processReport(FSNamesystem.java:3460)
at org.apache.hadoop.hdfs.server.namenode.NameNode.
at org.apache.hadoop.ipc.Client.call(Client.java:1070)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy5.blockReport(Unknown Source)
at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:958)
at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
at java.lang.Thread.run(Thread.java:636)
2012-08-03 17:47:33,873 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:50075
2012-08-03 17:47:33,980 INFO org.apache.hadoop.ipc.Server: Stopping server on 50020
2012-08-03 17:47:33,981 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: exiting
2012-08-03 17:47:33,981 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: exiting
2012-08-03 17:47:33,981 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: exiting
2012-08-03 17:47:33,981 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
2012-08-03 17:47:33,982 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.111.31:50010, storageID=DS-1062340636-127.0.0.1-50010-1339803955209, infoPort=50075, ipcPort=50020):DataXceiveServer:java.nio.channels.AsynchronousCloseException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:170)
at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:102)
at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:131)
at java.lang.Thread.run(Thread.java:636)
2012-08-03 17:47:33,982 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 50020
2012-08-03 17:47:33,982 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting DataXceiveServer
2012-08-03 17:47:33,983 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2012-08-03 17:47:33,982 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 1
2012-08-03 17:47:33,984 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Exiting DataBlockScanner thread.
2012-08-03 17:47:33,985 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
2012-08-03 17:47:33,985 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
2012-08-03 17:47:33,985 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.111.31:50010, storageID=DS-1062340636-127.0.0.1-50010-1339803955209, infoPort=50075, ipcPort=50020):Finishing DataNode in: FSDataset{dirpath='/app/hadoop/tmp/dfs/data/current'}
2012-08-03 17:47:
at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:540)
at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.shutdown(FSDataset.java:2067)
at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:799)
at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1471)
at java.lang.Thread.run(Thread.java:636)
2012-08-03 17:47:33,988 WARN org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: AsyncDiskService has already shut down.
2012-08-03 17:47:33,989 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
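
The line that stands out to me in the log is the UnregisteredDatanodeException: the DataNode at 192.168.111.31 reports a storage ID that the NameNode expects from 192.168.111.32. To pull that conflict out of a long log, something like the Python sketch below might help. It is only a rough illustration: the function name find_storage_conflicts and the SAMPLE string are made up for this example, and the pattern assumes the English wording of the exception message as Hadoop 1.0.3 prints it.

```python
import re

# Matches the message text of UnregisteredDatanodeException.
# Assumption: the English wording emitted by Hadoop 1.0.3.
CONFLICT = re.compile(
    r"Data node (?P<reporter>\S+) is attempting to report storage ID "
    r"(?P<storage_id>\S+)\. Node (?P<expected>\S+) is expected to serve this storage"
)

# Hypothetical sample: the exception line from the log above.
SAMPLE = (
    "2012-08-03 17:47:33,793 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: "
    "DataNode is shutting down: org.apache.hadoop.ipc.RemoteException: "
    "org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: "
    "Data node 192.168.111.31:50010 is attempting to report storage ID "
    "DS-1062340636-127.0.0.1-50010-1339803955209. "
    "Node 192.168.111.32:50010 is expected to serve this storage."
)


def find_storage_conflicts(log_text):
    """Return one dict per storage ID conflict found in log_text.

    Each dict has the keys 'reporter' (the DataNode that was rejected),
    'storage_id', and 'expected' (the node the NameNode associates
    with that storage ID).
    """
    return [match.groupdict() for match in CONFLICT.finditer(log_text)]


if __name__ == "__main__":
    for conflict in find_storage_conflicts(SAMPLE):
        print(
            "{reporter} was rejected: storage ID {storage_id} "
            "is registered to {expected}".format(**conflict)
        )
```

Running it over the DataNode logs of all four nodes would show which pairs of nodes claim the same storage ID. It does not explain why they share one.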