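Hi all, I am trying to run clusterdump on the output of the k-means clustering algorithm, and it is not working. Any ideas? This is the example from Mahout in Action, run on a pseudo-distributed cluster.
Also, are there any tools or methods for visualizing the clusterdump output or the k-means output?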
[186946@01HW534064 bin]$ ./mahout clusterdump -dt sequencefile -d /home/186946/reuters-vectors/dictionary.file-0-i reuters-fkmeans-clusters/clusters-3 -o /home/186946/clusters.txt -b 10 -n 10
Running on hadoop, using HADOOP_HOME=/home/186946/hadoop-0.20.2-cdh3u5
No HADOOP_CONF_DIR set, using /home/186946/hadoop-0.20.2-cdh3u5/src/conf
MAHOUT-JOB: /home/186946/mahout-0.5-cdh3u5/mahout-examples-0.5-cdh3u5-job.jar
MAHOUT-JOB: /home/186946/mahout-0.5-cdh3u5/mahout-examples-0.5-cdh3u5-job.jar
13/03/08 17:26:11 ERROR common.AbstractJob: Unexpected reuters-fkmeans-clusters/clusters-3 while processing Job-Specific Options:
usage: <command> [Generic Options] [Job-Specific Options]
Generic Options:
-archives <paths> comma separated archives to be unarchived
on the compute machines.
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-files <paths> comma separated files to be copied to the
map reduce cluster
-fs <local|namenode:port> specify a namenode
-jt <local|jobtracker:port> specify a job tracker
-libjars <paths> comma separated jar files to include in
the classpath.
-tokenCacheFile <tokensFile> name of the file with the tokens
Unexpected reuters-fkmeans-clusters/clusters-3 while processing Job-Specific
Options:
Usage:
[--seqFileDir <seqFileDir> --output <output> --substring <substring>
--numWords <numWords> --pointsDir <pointsDir> --dictionary <dictionary>
--dictionaryType <dictionaryType> --help --tempDir <tempDir> --startPhase
<startPhase> --endPhase <endPhase>]
Job-Specific Options:
--seqFileDir (-s) seqFileDir The directory containing Sequence
Files for the Clusters
--output (-o) output Optional output directory. Default
is to output to the console.
--substring (-b) substring The number of chars of the
asFormatString() to print
--numWords (-n) numWords The number of top terms to print
--pointsDir (-p) pointsDir The directory containing points
sequence files mapping input vectors
to their cluster. If specified,
then the program will output the
points associated with a cluster
--dictionary (-d) dictionary The dictionary file
--dictionaryType (-dt) dictionaryType The dictionary file type
(text|sequencefile)
--help (-h) Print out help
--tempDir tempDir Intermediate output directory
--startPhase startPhase First phase to run
--endPhase endPhase Last phase to run
13/03/08 17:26:11 INFO driver.MahoutDriver: Program took 133 ms
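One thing that stands out in the usage text above: the cluster directory is expected after `--seqFileDir (-s)`, while the command passes `reuters-fkmeans-clusters/clusters-3` with no flag in front of it, which matches the "Unexpected reuters-fkmeans-clusters/clusters-3" error. Below is a minimal sketch of the same command with `-s` added, assuming that is the only problem. All other values, including the dictionary path, are copied unchanged from the failing command, and the variable names `DICT`, `CLUSTERS`, `OUT`, and `CMD` are only for illustration.

```shell
# Values copied unchanged from the failing command above.
DICT=/home/186946/reuters-vectors/dictionary.file-0-i
CLUSTERS=reuters-fkmeans-clusters/clusters-3
OUT=/home/186946/clusters.txt

# Same arguments as before, with -s (--seqFileDir, per the usage text)
# placed in front of the cluster directory.
CMD="./mahout clusterdump -dt sequencefile -d $DICT -s $CLUSTERS -o $OUT -b 10 -n 10"

# Print the command so it can be checked before it is run.
echo "$CMD"

# Run it only when the mahout launcher exists in the current directory
# (the original command was run from Mahout's bin directory).
if [ -x ./mahout ]; then
  $CMD
fi
```

The command is printed first so the argument order can be compared against the usage text before anything runs.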
Thanks