I tried to run the naive Bayes algorithm on EMR using 1 master (small) and 1 slave (small) node. The seqdirectory, seq2sparse and split steps completed successfully, but the training phase failed with errors. I used the following command to train the model:

./elastic-mapreduce --jar s3n://<bucket name>/mahout/mahout-examples-0.7-job.jar \
    --main-class org.apache.mahout.driver.MahoutDriver \
    --logs \
    --arg trainnb \
    --arg -i --arg /<folder name>/mahout/review-train-vectors/ --arg -el \
    --arg -o --arg /<folder name>/mahout/model/ \
    --arg -li --arg /<folder name>/mahout/labelindex/ \
    --arg -ow \
    -j <job-name>

Here's the log of the job step:

java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

attempt_201302130846_0035_m_000000_0: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201302130846_0035_m_000000_0: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_0: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201302130846_0035/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_0: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

attempt_201302130846_0035_m_000000_1: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201302130846_0035_m_000000_1: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_1: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201302130846_0035/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_1: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

attempt_201302130846_0035_m_000000_2: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201302130846_0035_m_000000_2: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_2: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201302130846_0035/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_2: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

Has anyone tried this before? Please help me resolve this issue. I get the same error when I run this algorithm in Hadoop pseudo-distributed mode on my local system. The algorithm works only when the MAHOUT_LOCAL=True environment variable is set.

1 Answer

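There is a problem with the command's arguments. It looks like you copied and pasted the command without adapting it to your environment: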

  --jar s3n://<bucket name>/mahout/mahout-examples-0.7-job.jar

What is the bucket name?

 --arg -i --arg /<folder name>/mahout/review-train-vectors/

<folder name> looks like a variable that you should change to match your own setup.

-j <job-name>

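Same kind of error. It seems you are not an experienced Linux user, so please note that the \ character at the end of each line should be skipped (most likely it is there on the web page you took the command from, for better readability; are you sure it is a single command? Multi-line commands are not that common. :))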
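To illustrate the point about trailing backslashes, here is a minimal sketch in a POSIX shell. A backslash placed immediately before a newline acts as a line continuation, so a command split over several lines is read the same way as the single-line form without the backslashes. The values `my-bucket` and `my-folder` are hypothetical stand-ins for the placeholders in the question, not real names.

```shell
# Hypothetical values: replace with your own bucket and folder names.
BUCKET="my-bucket"
FOLDER="my-folder"

# Placeholders filled in from the variables above.
JAR="s3n://${BUCKET}/mahout/mahout-examples-0.7-job.jar"

# Multi-line form: each trailing backslash joins the next line to this one.
# The backslash must be the very last character on the line.
set -- --arg -i \
    --arg "/${FOLDER}/mahout/review-train-vectors/" \
    --arg -el
multi="$*"

# Single-line form: the same arguments with the backslashes dropped.
set -- --arg -i --arg "/${FOLDER}/mahout/review-train-vectors/" --arg -el
single="$*"

printf '%s\n' "$JAR"
printf '%s\n' "$multi"
printf '%s\n' "$single"
```

Both forms produce the same argument list, so the backslashes are only needed when the command is kept on several lines.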

Answered 2013-02-16T15:57:37.190