
I'm having trouble starting the datanode in Hadoop. From the log I can see that the datanode gets started twice (excerpt below):

2012-05-22 16:25:00,369 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = master/192.168.0.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1243785; compiled by 'hortonfo' on Tue Feb 14 08:15:38 UTC 2012
************************************************************/
2012-05-22 16:25:00,375 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = master/192.168.0.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1243785; compiled by 'hortonfo' on Tue Feb 14 08:15:38 UTC 2012
************************************************************/
2012-05-22 16:25:00,490 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-05-22 16:25:00,500 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-05-22 16:25:00,500 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-05-22 16:25:00,500 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-05-22 16:25:00,512 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-05-22 16:25:00,523 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-05-22 16:25:00,523 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-05-22 16:25:00,524 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-05-22 16:25:00,722 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-05-22 16:25:00,724 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-05-22 16:25:00,727 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-05-22 16:25:00,729 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-05-22 16:20:15,894 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /app/hadoop/tmp/dfs/data. The directory is already locked.
2012-05-22 16:20:16,008 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Cannot lock storage /app/hadoop/tmp/dfs/data. The directory is already locked.
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:602)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:455)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:111)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:385)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)

I searched online and found this issue described, but I am not overriding anything in conf/hdfs-site.xml (shown below), so Hadoop should be using the defaults, which (as described there) should not cause any failed locks. This is my conf/hdfs-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
</configuration>

This is my conf/core-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
    <description>The name of the default file system.  A URI whose
    scheme and authority determine the FileSystem implementation.  The
    uri's scheme determines the config property (fs.SCHEME.impl) naming
    the FileSystem implementation class.  The uri's authority is used to
    determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>

And this is the content of hadoop/conf/slaves:

master
slave

5 Answers

  • Stop the datanode
  • Delete the in_use.lock file from the dfs data directory location
  • Start the datanode again

It should then work fine.
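
A minimal command sequence for those steps, assuming the data directory from this question (/app/hadoop/tmp/dfs/data) and the standard Hadoop 1.x daemon script under $HADOOP_HOME/bin:

# stop the datanode on this host
$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode

# remove the stale lock file from the dfs data directory
rm /app/hadoop/tmp/dfs/data/in_use.lock

# start the datanode again
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode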

Answered 2013-08-07T19:16:17.140

Also add the following two properties to your hdfs-site.xml file.

<property>
  <name>dfs.name.dir</name>
  <value>/some_path</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/some_path</value>
</property>

Their default location is under /tmp, so you lose the data on every restart.

Answered 2012-05-29T22:31:01.607

I ran into a similar problem, but I read an article saying that dfs.name.dir and dfs.data.dir should be different from each other. I had them pointing at the same directory, and changing the values so they differ solved my problem.
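
For example, a minimal hdfs-site.xml sketch with the two directories kept separate (the paths below are placeholders, not taken from the original answer):

<!-- placeholder paths: pick two distinct locations outside /tmp -->
<property>
  <name>dfs.name.dir</name>
  <value>/app/hadoop/dfs/name</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/app/hadoop/dfs/data</value>
</property>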

Answered 2013-02-02T19:44:15.750

I had the same problem; setting umask to 0022 in the shell where I was running the tests solved it for me.
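
For instance, something along these lines in the same shell before starting the daemon (the exact start command depends on your setup; this is just a sketch):

# relax the umask so the storage directory permissions come out as expected
umask 0022
# then start the datanode from the same shell (assumed Hadoop 1.x script)
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode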

Answered 2013-06-17T15:53:11.047
sudo rm -r /tmp/hadoop-[YourNameHere]/dfs/data/

and then restart dfs.
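
Restarting dfs would be, for example (assuming the standard Hadoop control scripts):

$HADOOP_HOME/bin/stop-dfs.sh
$HADOOP_HOME/bin/start-dfs.sh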

Answered 2016-07-25T00:02:45.783