I am trying to run the k-means algorithm from Mahout on EMR. The input vectorized data is stored in S3.
My command:
elastic-mapreduce --jar s3://mybucket/dir/mahout-examples-0.8-SNAPSHOT-job.jar --main-class org.apache.mahout.driver.MahoutDriver --arg kmeans --arg -i --arg s3://mybucket/dir/normalized-bigram/tfidf-vectors --arg -c --arg /dir/kmeans-centroids --arg -o --arg /dir/kmeans-clusters --arg -cd --arg 1.0 --arg -k --arg 20 --arg -x --arg 20 --arg -cl --arg -dm --arg org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure -j $JOBID --step-name "kmeans-euclidean-distance-measure"
I get the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: This file system object (hdfs://10.229.34.19:9000) does not support access to the request path 's3://hdp.wtest/wiki/normalized-bigram/tfidf-vectors' You possibly called FileSystem.get(conf) when you should have called FileSystem.get(uri, conf) to obtain a file system supporting your path.
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:384)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:129)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:513)
at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:71)
at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:94)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:48)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
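As far as I can tell from the message, the check that fails is a comparison between the URI of the default file system (hdfs://...) and the URI of my input path (s3://...). Below is a minimal sketch of how I understand that comparison, written with plain java.net.URI only. The class name SchemeCheck and the method sameFileSystem are hypothetical names of mine, and this is a simplified illustration, not Hadoop's actual FileSystem.checkPath code.

```java
import java.net.URI;

// Hypothetical, simplified illustration of the check described in the
// exception message. This is NOT Hadoop's real implementation.
public class SchemeCheck {

    // Returns true if 'path' could be served by the file system at 'defaultFs'.
    static boolean sameFileSystem(URI defaultFs, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            // A path without a scheme (e.g. /dir/kmeans-centroids) is
            // resolved against the default file system, so it is accepted.
            return true;
        }
        if (!scheme.equalsIgnoreCase(defaultFs.getScheme())) {
            // Different scheme (s3 vs hdfs): this file system cannot serve it.
            return false;
        }
        String authority = path.getAuthority();
        // Same scheme: accept if no authority is given or if it matches.
        return authority == null
                || authority.equalsIgnoreCase(defaultFs.getAuthority());
    }

    public static void main(String[] args) {
        URI hdfs = URI.create("hdfs://10.229.34.19:9000");
        // Scheme-less path, like my -c and -o arguments: prints true
        System.out.println(sameFileSystem(hdfs, URI.create("/dir/kmeans-centroids")));
        // s3:// path, like the one in the exception: prints false
        System.out.println(sameFileSystem(hdfs,
                URI.create("s3://hdp.wtest/wiki/normalized-bigram/tfidf-vectors")));
    }
}
```

If this reading is right, only the -i argument would trip the check, because it is the only path in my command that carries an s3:// scheme.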
What is wrong with my command?