We have CDH 5.2 and Cloudera Manager 5.

We want to copy data from nameservice2 to nameservice1.

Both clusters run the same CDH version.

When I run

hadoop distcp hdfs://nameservice2/foo/bar hdfs://nameservice1/bar/foo

I get this error:

java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice2

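So I added the following nameservice2 configuration to the nameservice1 cluster, in the HDFS Client Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml in Cloudera Manager (Gateway Default Group):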

<property>
<name>dfs.nameservices</name>
<value>nameservices2</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.nameservices2</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.namenodes.nameservices2</name>
<value>namenode36,namenode405</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservices2.namenode36</name>
<value>hnn001.prod.cc:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservices2.namenode36</name>
<value>hnn001.prod.com:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservices2.namenode36</name>
<value>hnn001.prod.com:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservices2.namenode36</name>
<value>hnn001.prod.com:50470</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservices2.namenode405</name>
<value>hnn002.prod.com:50470</value>
</property>

But I still get the same error.

Is there a workaround?

Thanks

1 Answer

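In HA-enabled HDFS, nameservice1 and nameservice2 are logical names, not hostnames, so you cannot use a port with them.

You have two options.

The simple option is to find the active NameNode of each cluster and use its host:port in the distcp command, as shown below. The NameNode web UI shows which NameNode is active in each cluster.

hadoop distcp hdfs://hnn001.prod.cc:8020/foo/bar hdfs://<dest-cluster-active-nn-hostname>:8020/bar/foo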
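If you would rather not open the web UI, the active NameNode can also be found from the command line with hdfs haadmin -getServiceState. A minimal sketch, assuming it runs on a host of the cluster being queried and that namenode36 and namenode405 are the IDs listed in dfs.ha.namenodes for that cluster. find_active_nn is a helper name made up for this example, and on a client with several nameservices configured haadmin may need the nameservice specified as well.

```shell
#!/bin/sh
# Print the ID of the first NameNode that reports the "active" HA state.
# Arguments: NameNode IDs as listed in dfs.ha.namenodes.<nameservice>.
# find_active_nn is a hypothetical helper, not part of Hadoop.
find_active_nn() {
  for nn in "$@"; do
    state=$(hdfs haadmin -getServiceState "$nn" 2>/dev/null)
    if [ "$state" = "active" ]; then
      echo "$nn"
      return 0
    fi
  done
  # No NameNode reported "active".
  return 1
}

# Example call, run on a host of the source cluster:
#   find_active_nn namenode36 namenode405
```

Keep in mind that a URI pinned to a single NameNode host stops working if that NameNode fails over to standby while the copy is running.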

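The other option is to use the logical names of both clusters, as shown below. Before trying this command, make sure both nameservice1 and nameservice2 are configured correctly in the client hdfs-site.xml.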

hadoop distcp hdfs://nameservice2/foo/bar hdfs://nameservice1/bar/foo

Verify on the local cluster that the remote cluster's nameservice is configured.
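One way to do that check from the shell, as a rough sketch: hdfs getconf -confKey prints the value the client actually resolves for a key. has_nameservice is a hypothetical helper, and the comparison assumes the value is a plain comma-separated list without spaces.

```shell
#!/bin/sh
# Return 0 if the nameservice ID $2 appears in the comma-separated list $1.
# has_nameservice is a hypothetical helper, not part of Hadoop.
has_nameservice() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Query the client configuration only when the hdfs CLI is on PATH.
if command -v hdfs >/dev/null 2>&1; then
  configured=$(hdfs getconf -confKey dfs.nameservices 2>/dev/null)
  for ns in nameservice1 nameservice2; do
    if has_nameservice "$configured" "$ns"; then
      echo "$ns: configured"
    else
      echo "$ns: missing from dfs.nameservices"
    fi
  done
fi
```

The match is exact, so an ID spelled nameservices2 in the configuration does not satisfy a URI that uses nameservice2.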

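It looks like nameservice2 is your local cluster and nameservice1 is the remote one. The local cluster's client configuration needs all the properties for both nameservice1 and nameservice2. In particular, dfs.nameservices must list both IDs, and each ID must match the name used in the distcp URI exactly. The client hdfs-site.xml on your local cluster should look like this.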

<configuration>
<!-- Available nameservices -->
<property>
<name>dfs.nameservices</name>
<value>nameservice1,nameservice2</value>
</property>

<!-- Local nameservice2 properties -->
<property>
<name>dfs.client.failover.proxy.provider.nameservice2</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.namenodes.nameservice2</name>
<value>namenode36,namenode405</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservice2.namenode36</name>
<value>hnn001.prod.cc:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservice2.namenode36</name>
<value>hnn001.prod.com:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservice2.namenode36</name>
<value>hnn001.prod.com:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservice2.namenode36</name>
<value>hnn001.prod.com:50470</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservice2.namenode405</name>
<value>hnn002.prod.com:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservice2.namenode405</name>
<value>hnn002.prod.com:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservice2.namenode405</name>
<value>hnn002.prod.com:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservice2.namenode405</name>
<value>hnn002.prod.com:50470</value>
</property>

<!-- Remote nameservice1 properties -->
<!-- You can find these properties in the remote machine's hdfs-site.xml file -->
<!-- namenodeXX, namenodeYY, REMOTE-NN1 and REMOTE-NN2 are placeholders -->

<property>
<name>dfs.client.failover.proxy.provider.nameservice1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.namenodes.nameservice1</name>
<value>namenodeXX,namenodeYY</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservice1.namenodeXX</name>
<value>REMOTE-NN1:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservice1.namenodeXX</name>
<value>REMOTE-NN1:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservice1.namenodeXX</name>
<value>REMOTE-NN1:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservice1.namenodeXX</name>
<value>REMOTE-NN1:50470</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nameservice1.namenodeYY</name>
<value>REMOTE-NN2:8020</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.nameservice1.namenodeYY</name>
<value>REMOTE-NN2:54321</value>
</property>
<property>
<name>dfs.namenode.http-address.nameservice1.namenodeYY</name>
<value>REMOTE-NN2:50070</value>
</property>
<property>
<name>dfs.namenode.https-address.nameservice1.namenodeYY</name>
<value>REMOTE-NN2:50470</value>
</property>

<!-- Other properties -->

</configuration>

In the configuration file above, replace all placeholders (such as XX and YY) with the corresponding values from the remote machine's hdfs-site.xml.
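To collect those values, something like the following could be run on a host of the remote cluster. It is only a sketch: ha_property_names is a hypothetical helper that lists the same property names as the configuration above, nameservice1 is assumed to be the remote nameservice ID, and the lookup relies on hdfs getconf -confKey being available there.

```shell
#!/bin/sh
# Print the hdfs-site.xml property names needed for one HA nameservice.
# $1 = nameservice ID; remaining arguments = its NameNode IDs.
# ha_property_names is a hypothetical helper, not part of Hadoop.
ha_property_names() {
  ns="$1"; shift
  echo "dfs.client.failover.proxy.provider.$ns"
  echo "dfs.ha.namenodes.$ns"
  for nn in "$@"; do
    for kind in rpc-address servicerpc-address http-address https-address; do
      echo "dfs.namenode.$kind.$ns.$nn"
    done
  done
}

# On the remote cluster: read the NameNode IDs, then print key=value pairs.
# Skipped entirely when the hdfs CLI is not on PATH.
if command -v hdfs >/dev/null 2>&1; then
  ids=$(hdfs getconf -confKey dfs.ha.namenodes.nameservice1 2>/dev/null | tr ',' ' ')
  # $ids is left unquoted on purpose so each NameNode ID becomes one argument.
  ha_property_names nameservice1 $ids | while read -r key; do
    printf '%s=%s\n' "$key" "$(hdfs getconf -confKey "$key" 2>/dev/null)"
  done
fi
```

Keys that come back empty are not set on that cluster and can be left out of the local file.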

answered 2014-11-11T09:30:57.410