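I am new to Hadoop on Linux. I am looking for guidance on getting Hadoop up and running so that I can write C++ jobs. I tried installing Hadoop in pseudo-distributed mode using this tutorial: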

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

It works for Java, but I get this error when running the C++ wordcount example:

12/05/03 18:23:00 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost/user/c1048267/books
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
    at org.apache.hadoop.mapred.pipes.Submitter.runJob(Submitter.java:248)
    at org.apache.hadoop.mapred.pipes.Submitter.run(Submitter.java:479)
    at org.apache.hadoop.mapred.pipes.Submitter.main(Submitter.java:494)

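If there are specific software, hardware, or configuration requirements, please point me to them as well. I am currently using 64-bit Ubuntu 10.04, Hadoop-0.20.2, and Java_Sun_6. Does this platform support Hadoop Pipes? If not, please advise.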

1 Answer

First, you need to configure your HADOOP_CLASSPATH to include all the libraries related to Pipes, then compile against them by creating a Makefile:

CC = g++
HADOOP_INSTALL = /home/hadoop/hadoop
PLATFORM = Linux-i386-32
CPPFLAGS = -m32 -I$(HADOOP_INSTALL)/c++/$(PLATFORM)/include

wordcount: wordcount.cpp
	$(CC) $(CPPFLAGS) $< -Wall -L$(HADOOP_INSTALL)/c++/$(PLATFORM)/lib -lhadooppipes \
	-lhadooputils -lpthread -g -O2 -o $@

You need to do this on every machine in the cluster, and each one needs libssl and g++ installed. To compile the wordcount example, run:

make wordcount
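
The Makefile above hard-codes the 32-bit platform directory, while the question mentions a 64-bit Ubuntu install. As a rough sketch, the platform directory could be chosen from the machine type instead. The helper name hadoop_platform_for is illustrative and not part of Hadoop; the two directory names are the ones shipped under c++/ in the Hadoop 0.20.2 tarball, and anything else is treated as unsupported here.

```shell
#!/bin/sh
# Sketch: map a machine type (as printed by `uname -m`) to the matching
# directory under $HADOOP_INSTALL/c++/.
# hadoop_platform_for is a hypothetical helper, not a Hadoop command.
hadoop_platform_for() {
    case "$1" in
        x86_64|amd64)        echo "Linux-amd64-64" ;;
        i386|i486|i586|i686) echo "Linux-i386-32" ;;
        *)
            echo "unsupported machine type: $1" >&2
            return 1
            ;;
    esac
}
```

A possible use is `make wordcount PLATFORM="$(hadoop_platform_for "$(uname -m)")"`, since variables given on the make command line override the Makefile's own assignments. On a 64-bit build the -m32 flag in CPPFLAGS would also have to be dropped.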

Then copy the resulting binary into a bin directory in HDFS:

hadoop dfs -mkdir bin
hadoop dfs -put wordcount bin/wordcount

Run the program:

hadoop pipes -D hadoop.pipes.java.recordreader=true \
             -D hadoop.pipes.java.recordwriter=true \
             -input dft1 -output dft1-out \
             -program bin/wordcount
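
When the job is run more than once, Hadoop refuses to start if the output directory already exists. The following is a minimal sketch of a wrapper around the command above that clears stale output first. The names run_wordcount and HADOOP_CMD are illustrative, and the sketch assumes that `hadoop dfs -test -e` exits with status 0 when the path exists. Note that it deletes the previous output, so it is only suitable for scratch directories.

```shell
#!/bin/sh
# Sketch: rerunnable wrapper around the `hadoop pipes` invocation.
# HADOOP_CMD and run_wordcount are hypothetical names used for illustration.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

run_wordcount() {
    input="$1"
    output="$2"
    program="$3"

    # A job fails if its output directory already exists,
    # so remove any output left over from an earlier run.
    if "$HADOOP_CMD" dfs -test -e "$output"; then
        "$HADOOP_CMD" dfs -rmr "$output"
    fi

    "$HADOOP_CMD" pipes \
        -D hadoop.pipes.java.recordreader=true \
        -D hadoop.pipes.java.recordwriter=true \
        -input "$input" -output "$output" \
        -program "$program"
}
```

With the paths used above, the call would be `run_wordcount dft1 dft1-out bin/wordcount`.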

The second thing I notice is this:

See JobConf(Class) or JobConf#setJar(String).
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost/user/c1048267/books

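Are you sure that directory exists in HDFS? The job is looking for books under your HDFS home directory, /user/c1048267. Best wishes.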
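
One way to rule this out is to check for the input path before submitting the job and upload it if it is missing. The sketch below assumes the input files sit in some local directory. The names HADOOP_CMD, resolve_hdfs_path and ensure_hdfs_input are illustrative, and it again assumes `hadoop dfs -test -e` exits with status 0 when the path exists.

```shell
#!/bin/sh
# Sketch: make sure the job's input path exists in HDFS before submitting.
# HADOOP_CMD, resolve_hdfs_path and ensure_hdfs_input are hypothetical names.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Show where a path ends up: relative HDFS paths resolve against
# /user/<username>, absolute ones are used unchanged.
resolve_hdfs_path() {
    case "$2" in
        /*) printf '%s\n' "$2" ;;
        *)  printf '/user/%s/%s\n' "$1" "$2" ;;
    esac
}

# Upload local_dir to hdfs_path unless hdfs_path is already there.
ensure_hdfs_input() {
    local_dir="$1"
    hdfs_path="$2"
    if "$HADOOP_CMD" dfs -test -e "$hdfs_path"; then
        echo "input already in HDFS: $hdfs_path"
    else
        "$HADOOP_CMD" dfs -copyFromLocal "$local_dir" "$hdfs_path"
    fi
}
```

For the error above, `resolve_hdfs_path c1048267 books` prints /user/c1048267/books, which matches the path in the exception. A call such as `ensure_hdfs_input /tmp/books books` would upload the files, where /tmp/books is only a placeholder for wherever the input is stored locally.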

For a complete guide, see this link.

Answered 2013-04-20T02:16:59.727