Questions tagged [webhdfs]

250 questions

0 votes

1 answer

9774 views

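hadoop - WebHDFS not working on a secure Hadoop cluster

I am trying to secure my HDP2 Hadoop cluster with Kerberos.

So far HDFS, Hive, HBase, Hue Beeswax and the Hue job/task browsers are working fine; however, Hue's file browser is not working. It answers:

My hue.ini file has security_enabled=true set everywhere, along with the other related parameters.

I believe the problem lies with WebHDFS.

I tried the curl command given at http://hadoop.apache.org/docs/r1.0.4/webhdfs.html#Authentication

Answer:

I can reproduce Hue's error message by adding a user to the curl request as follows:

It answers:

There seems to be no Kerberos negotiation between WebHDFS and curl.

I was expecting something like:

Any idea what may be going wrong?

I have this in hdfs-site.xml on every node: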

2014-10-07T16:49:04.527

0 votes

1 answer

1732 views

hadoop - Hadoop: Multinode cluster only recognizes 2 live nodes out of 3 data nodes

I have set up a multinode Hadoop cluster with 3 datanodes and 1 namenode using VirtualBox on Ubuntu. My host system serves as the NameNode (and also as a datanode), and two VMs serve as DataNodes. My systems are:

192.168.1.5: NameNode (also datanode)
192.168.1.10: DataNode2
192.168.1.11: DataNode3

I am able to SSH into every system from each of the others. hadoop/etc/hadoop/slaves on all systems has these entries:

hadoop/etc/hadoop/master on all systems has the entry: 192.168.1.5

core-site.xml, yarn-site.xml, hdfs-site.xml, mapred-site.xml and hadoop-env.sh are the same on all machines, except that the dfs.namenode.name.dir entry is missing from hdfs-site.xml on both DataNodes. When I execute start-yarn.sh and start-dfs.sh from the NameNode, everything works fine, and with jps I can see all required services on all machines.

However, when I check namenode/dfshealth.html#tab-datanode and namenode:50070/dfshealth.html#tab-overview, both indicate only 2 datanodes.

tab-datanode shows NameNode and DataNode2 as active datanodes. DataNode3 is not displayed at all.

I checked all configuration files (the XML and sh files mentioned above, plus slaves/master) multiple times to make sure nothing differs between the two datanodes.

The etc/hosts file also contains entries for all nodes on all systems:

One thing I'd like to mention is that I configured one VM first and then cloned it, so both VMs have the same configuration. That makes it even more confusing why one datanode is shown but not the other.

hadoop hdfs webhdfs

2014-10-12T12:51:29.753

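0 votes

0 answers

202 views

hadoop - Is there a way to pull an entire directory in Hadoop through webhdfs?

We have two clusters, and our requirement is to pull data from one cluster to the other.

The only option available to us is to pull the data through webhdfs!

Unfortunately, as far as we can see, webhdfs lets us pull only one file at a time, and that takes two commands per file.

My direct question is: is there a way to pull an entire directory's data through webhdfs?

Can someone help me with this...

Note: for security reasons, DISTCP is not a viable option for us!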
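WebHDFS has no single "download directory" call that I am aware of, but a client can combine `LISTSTATUS` (to walk the tree) with `OPEN` (to fetch each file). The sketch below is one way to do that with only the Python standard library. The NameNode address in `BASE`, the helper names, and the `save` callback are all hypothetical; it assumes an unsecured cluster (no Kerberos or delegation token), assumes `path` is a directory, and reads each file fully into memory, so it is not suited to very large files as written.

```python
import json
import posixpath
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Hypothetical NameNode HTTP address; replace with your own.
BASE = "http://namenode.example.com:50070/webhdfs/v1"


def webhdfs_url(path, op, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path."""
    query = urlencode({"op": op, **params})
    return f"{BASE}{quote(path)}?{query}"


def http_get(url):
    """Fetch a URL and return the body; urlopen follows the redirect to a DataNode."""
    with urlopen(url) as resp:
        return resp.read()


def list_files(path, fetch=http_get):
    """Yield every file path under `path`, descending into subdirectories."""
    listing = json.loads(fetch(webhdfs_url(path, "LISTSTATUS")))
    for status in listing["FileStatuses"]["FileStatus"]:
        child = posixpath.join(path, status["pathSuffix"])
        if status["type"] == "DIRECTORY":
            yield from list_files(child, fetch)
        else:
            yield child


def pull_directory(path, save, fetch=http_get):
    """Download each file under `path` and pass (hdfs_path, content_bytes) to `save`."""
    for file_path in list_files(path, fetch):
        save(file_path, fetch(webhdfs_url(file_path, "OPEN")))
```

Because the HTTP client follows the redirect that `OPEN` returns, the "two commands per file" collapse into one call per file on the client side; the `fetch` parameter exists only so the traversal can be exercised without a live cluster.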

hadoop webhdfs distcp

2014-10-13T05:28:29.767

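0 votes

0 answers

722 views

hadoop - Hadoop: the namenode/dfshealth.html#tab-datanode and namenode:50070/dfshealth.html#tab-overview pages show only 2 of 3 live nodes

I have set up a fully distributed Hadoop system on Ubuntu. I have my host system, with 2 VirtualBox VMs installed on top of it. When I execute start-dfs.sh and start-yarn.sh from the master node, datanode starts on all 3 systems. I can see this using jps. But when I try to verify with the Web UI, it shows only 2 live nodes. One of the datanodes, installed on a VM, is not visible at all. Is there a bug in the UI, or is something else wrong?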

hadoop hdfs hadoop-yarn webhdfs

2014-10-13T14:25:11.357

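0 votes

1 answer

1076 views

hadoop - File browser in HUE not working, upload fails for user

When I try to upload anything, I see this error reported for the user:

"Failed to obtain user group information: org.apache.hadoop.security.authorize.AuthorizationException: User: hadoop is not allowed to impersonate hue"

I don't know what to do... I changed the psuedo_distribution file in Hue and added the proxy groups hdfs and hadoop to the Hadoop core-site.xml

Please help!

Thanks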

hadoop hdfs hue webhdfs

2014-11-21T21:25:15.680

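0 votes

1 answer

463 views

webhdfs - No module named pywebhdfs.web hdfs

I have installed pywebhdfs on 5 nodes and verified it with

python help('pywebhdfs')

Help on package pywebhdfs:

NAME pywebhdfs

FILE /../pywebhdfs/__init__.py

I get "No module named pywebhdfs.web hdfs"

Any help is appreciated, thanks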

webhdfs

2014-12-01T14:44:34.547

0 votes

1 answer

581 views

hadoop - Is there an equivalent of move in webhdfs

I want to use webhdfs to move one or more files from one path to another. I am using hadoop 1.3. Is there such a REST call?
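The WebHDFS REST API documents a `RENAME` operation (an HTTP PUT with a `destination` parameter), which behaves like a move within one filesystem. The following is a minimal sketch of issuing it from Python's standard library. The NameNode address in `BASE` and the function names are hypothetical, and the optional `user.name` parameter applies only when security is off; whether it works on the asker's exact Hadoop version is an assumption.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Hypothetical NameNode HTTP address; replace with your own.
BASE = "http://namenode.example.com:50070/webhdfs/v1"


def rename_request(src, dst, user=None):
    """Build the PUT request that asks WebHDFS to rename (move) src to dst."""
    params = {"op": "RENAME", "destination": dst}
    if user:
        # Pseudo authentication for unsecured clusters only.
        params["user.name"] = user
    return Request(f"{BASE}{quote(src)}?{urlencode(params)}", method="PUT")


def move(src, dst, user=None):
    """Send the rename and return the boolean reported in the JSON reply."""
    with urlopen(rename_request(src, dst, user)) as resp:
        return json.load(resp)["boolean"]
```

Moving several files means one `RENAME` call per file, for example `move("/tmp/a.txt", "/archive/a.txt")` inside a loop over the source paths.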

hadoop webhdfs

2014-12-03T18:21:11.850

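0 votes

0 answers

320 views

hadoop - HTTP/1.1 401 Access to HttpFS server is restricted

bivm:/home/biadmin/Desktop # curl -i "http://bivm.ibm.com:14000/webhdfs/v1/tmp/newfile?op=OPEN"
HTTP/1.1 401 Access to HttpFS server is restricted. Please obtain proper credentials from the BigInsights console first.
Content-Type: text/html; charset=iso-8859-1
Cache-Control: must-revalidate,no-cache,no-store
Content-Length: 1589
Server: Jetty(6.1.x)

I get the above error when I try to access a file in the cluster with a WebHDFS REST API call through the HttpFS server (running on port 14000 by default). Please advise.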

hadoop curl webhdfs biginsights

2014-12-05T07:14:27.783

0 votes

2 answers

2441 views

apache-spark - Spark with Webhdfs/httpfs

I would like to read a file from HDFS into Spark via httpfs or Webhdfs. Something like

sc.textFile("webhdfs://myhost:14000/webhdfs/v1/path/to/file.txt")

or, ideally,

sc.textFile("httpfs://myhost:14000/webhdfs/v1/path/to/file.txt")

Is there a way to get Spark to read the file over Webhdfs/httpfs?

apache-spark webhdfs

2014-12-08T22:11:51.903

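0 votes

1 answer

1576 views

webhdfs - Get directory size from WebHDFS?

I see that webhdfs does not support directory size. In HDFS, I can use

Is there a way to derive this from webHDFS? I need to do this programmatically, not by looking at a page.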
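One candidate for this is the `GETCONTENTSUMMARY` operation in the WebHDFS REST API, which returns a `ContentSummary` JSON object including `length` (total bytes under the path) and `spaceConsumed` (bytes including replication). Below is a small sketch of reading it programmatically. The NameNode address in `BASE` and the function names are hypothetical, and it assumes an unsecured cluster.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical NameNode HTTP address; replace with your own.
BASE = "http://namenode.example.com:50070/webhdfs/v1"


def parse_content_summary(body):
    """Return (length, spaceConsumed) in bytes from a GETCONTENTSUMMARY reply."""
    summary = json.loads(body)["ContentSummary"]
    return summary["length"], summary["spaceConsumed"]


def directory_size(path):
    """Ask WebHDFS for the content summary of `path` and return its sizes."""
    with urlopen(f"{BASE}{quote(path)}?op=GETCONTENTSUMMARY") as resp:
        return parse_content_summary(resp.read())
```

Keeping the parsing in its own function makes it easy to check against a saved response before pointing `directory_size` at a real cluster.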

webhdfs

2014-12-09T15:35:26.077
