Following the instructions provided here and here, I installed Mahout on the Bitnami AMI ami-02fb006b (and on several other AMIs, otherwise I wouldn't be asking this question).

I always get stuck when trying to run ./examples/bin/build-reuters.sh. This is the output of the command:

> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. lda clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> Downloading Reuters-21578
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100 7959k  100 7959k    0     0   294k      0  0:00:26  0:00:26 --:--:--  305k
> Extracting...
> Running on hadoop, using HADOOP_HOME=/usr/local/hadoop-0.20.2
> HADOOP_CONF_DIR=/usr/local/hadoop-0.20.2/conf
> MAHOUT-JOB: /usr/local/mahout-0.4/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar
> 11/08/16 20:10:25 WARN driver.MahoutDriver: No org.apache.lucene.benchmark.utils.ExtractReuters.props found on classpath, will use command-line arguments only
> Deleting all files in mahout-work/reuters-out-tmp
> 11/08/16 20:10:30 INFO driver.MahoutDriver: Program took 4906 ms
> MAHOUT_LOCAL is set, running locally
> CLASSPATH:
> :/usr/local/mahout-0.4/src/conf:/usr/local/hadoop-0.20.2/conf:/usr/lib/jvm/java-6-openjdk//lib/tools.jar:/usr/local/mahout-0.4/mahout-*.jar:/usr/local/mahout-0.4/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar:/usr/local/mahout-0.4/mahout-examples-*-job.jar:/usr/local/mahout-0.4/lib/*.jar:/usr/local/mahout-0.4/examples/target/dependency/antlr-2.7.7.jar:/usr/local/mahout-0.4/examples/target/dependency/antlr-3.2.jar:/usr/local/mahout-0.4/examples/target/dependency/antlr-runtime-3.2.jar:/usr/local/mahout-0.4/examples/target/dependency/avro-1.4.0-cassandra-1.jar:/usr/local/mahout-0.4/examples/target/dependency/bson-2.5.jar:/usr/local/mahout-0.4/examples/target/dependency/cassandra-all-0.8.1.jar:/usr/local/mahout-0.4/examples/target/dependency/cassandra-thrift-0.8.1.jar:/usr/local/mahout-0.4/examples/target/dependency/cglib-nodep-2.2.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-beanutils-1.7.0.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-beanutils-core-1.8.0.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-cli-1.2.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-cli-2.0-mahout.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-codec-1.4.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-collections-3.2.1.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-compress-1.1.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-configuration-1.6.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-dbcp-1.4.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-digester-1.7.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-httpclient-3.0.1.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-lang-2.6.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-logging-1.1.1.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-math-2.2.jar:/usr/local/mahout-0.4/examples/target/dependency/commons-pool-1.5.6.jar:/usr/local/mahout-0.4/examples/t
arget/dependency/concurrentlinkedhashmap-lru-1.1.jar:/usr/local/mahout-0.4/examples/target/dependency/easymock-3.0.jar:/usr/local/mahout-0.4/examples/target/dependency/google-collections-1.0-rc2.jar:/usr/local/mahout-0.4/examples/target/dependency/guava-r09.jar:/usr/local/mahout-0.4/examples/target/dependency/hadoop-core-0.20.203.0.jar:/usr/local/mahout-0.4/examples/target/dependency/hector-core-0.8.0-2.jar:/usr/local/mahout-0.4/examples/target/dependency/high-scale-lib-1.1.2.jar:/usr/local/mahout-0.4/examples/target/dependency/httpclient-4.0.1.jar:/usr/local/mahout-0.4/examples/target/dependency/httpcore-4.0.1.jar:/usr/local/mahout-0.4/examples/target/dependency/jackson-core-asl-1.8.2.jar:/usr/local/mahout-0.4/examples/target/dependency/jackson-mapper-asl-1.8.2.jar:/usr/local/mahout-0.4/examples/target/dependency/jakarta-regexp-1.4.jar:/usr/local/mahout-0.4/examples/target/dependency/jamm-0.2.2.jar:/usr/local/mahout-0.4/examples/target/dependency/jcommon-1.0.12.jar:/usr/local/mahout-0.4/examples/target/dependency/jetty-6.1.22.jar:/usr/local/mahout-0.4/examples/target/dependency/jetty-util-6.1.22.jar:/usr/local/mahout-0.4/examples/target/dependency/jfreechart-1.0.13.jar:/usr/local/mahout-0.4/examples/target/dependency/jline-0.9.94.jar:/usr/local/mahout-0.4/examples/target/dependency/json-simple-1.1.jar:/usr/local/mahout-0.4/examples/target/dependency/jul-to-slf4j-1.6.1.jar:/usr/local/mahout-0.4/examples/target/dependency/junit-4.8.2.jar:/usr/local/mahout-0.4/examples/target/dependency/libthrift-0.6.1.jar:/usr/local/mahout-0.4/examples/target/dependency/log4j-1.2.16.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-analyzers-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-benchmark-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-core-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-highlighter-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-memory-3.1.0.jar:/usr/local/mahout-0.4/examples
/target/dependency/lucene-queries-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/lucene-xercesImpl-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-collections-1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-core-0.6-SNAPSHOT.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-core-0.6-SNAPSHOT-tests.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-integration-0.6-SNAPSHOT.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-math-0.6-SNAPSHOT.jar:/usr/local/mahout-0.4/examples/target/dependency/mahout-math-0.6-SNAPSHOT-tests.jar:/usr/local/mahout-0.4/examples/target/dependency/mongo-java-driver-2.5.jar:/usr/local/mahout-0.4/examples/target/dependency/objenesis-1.2.jar:/usr/local/mahout-0.4/examples/target/dependency/servlet-api-2.5-20081211.jar:/usr/local/mahout-0.4/examples/target/dependency/servlet-api-2.5.jar:/usr/local/mahout-0.4/examples/target/dependency/slf4j-api-1.6.1.jar:/usr/local/mahout-0.4/examples/target/dependency/slf4j-jcl-1.6.1.jar:/usr/local/mahout-0.4/examples/target/dependency/slf4j-log4j12-1.6.1.jar:/usr/local/mahout-0.4/examples/target/dependency/snakeyaml-1.6.jar:/usr/local/mahout-0.4/examples/target/dependency/solr-commons-csv-3.1.0.jar:/usr/local/mahout-0.4/examples/target/dependency/speed4j-0.9.jar:/usr/local/mahout-0.4/examples/target/dependency/stringtemplate-3.2.jar:/usr/local/mahout-0.4/examples/target/dependency/uncommons-maths-1.2.2.jar:/usr/local/mahout-0.4/examples/target/dependency/uuid-3.2.0.jar:/usr/local/mahout-0.4/examples/target/dependency/watchmaker-framework-0.6.2.jar:/usr/local/mahout-0.4/examples/target/dependency/watchmaker-swing-0.6.2.jar:/usr/local/mahout-0.4/examples/target/dependency/xml-apis-1.0.b2.jar:/usr/local/mahout-0.4/examples/target/dependency/xpp3_min-1.1.4c.jar:/usr/local/mahout-0.4/examples/target/dependency/xstream-1.3.1.jar
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/local/mahout-0.4/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/local/mahout-0.4/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/local/mahout-0.4/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
> 11/08/16 20:10:32 INFO common.AbstractJob: Command line arguments: {--charset=UTF-8, --chunkSize=5, --endPhase=2147483647, --fileFilterClass=org.apache.mahout.text.PrefixAdditionFilter, --input=mahout-work/reuters-out, --keyPrefix=, --output=mahout-work/reuters-out-seqdir, --startPhase=0, --tempDir=temp}
> Exception in thread "main" java.io.IOException: Call to localhost/127.0.0.1:9000 failed on local exception: java.io.IOException: Broken pipe
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1033)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
>         at $Proxy1.getProtocolVersion(Unknown Source)
>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:364)
>         at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:208)
>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:175)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1310)
>         at org.apache.hadoop.fs.FileSystem.access$100(FileSystem.java:65)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1328)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:226)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:109)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:210)
>         at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:59)
>         at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:110)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>         at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:85)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:616)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
> Caused by: java.io.IOException: Broken pipe
>         at sun.nio.ch.FileDispatcher.write0(Native Method)
>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:122)
>         at sun.nio.ch.IOUtil.write(IOUtil.java:93)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:352)
>         at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:55)
>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>         at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>         at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>         at java.io.DataOutputStream.flush(DataOutputStream.java:123)
>         at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:746)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1011)
>         ... 25 more
> rmr: cannot remove mahout-work/reuters-out-seqdir: No such file or directory.
> put: File mahout-work/reuters-out-seqdir does not exist.

The error is consistent: I hit it on every installation attempt.

What can I do to resolve this?

1 Answer

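This looks like an error from Hadoop and/or EC2. For some reason, the Hadoop workers cannot write data to each other. I don't know why, but my guess is that a port isn't open, even locally.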
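One quick way to test the port guess is to try a plain TCP connection to the address in the stack trace (localhost:9000). The snippet below is only a minimal sketch: the helper name `port_is_open` and the 3-second timeout are my own choices, not anything from Hadoop or Mahout.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # localhost:9000 is the address the failing call in the stack trace targets.
    host, port = "localhost", 9000
    state = "open" if port_is_open(host, port) else "closed or unreachable"
    print(f"{host}:{port} is {state}")
```

If the port turns out to be closed, nothing is listening there. If it is open, that only shows that something accepts connections on it, so the cause would have to be looked for elsewhere.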

I have always used Amazon EMR directly.

You could perhaps debug this by trying another M/R job. As far as I can tell, it isn't directly related to Mahout.

Answered 2011-08-17T09:21:19.233