I tried to run the Naive Bayes algorithm on EMR using 1 master (small) and 1 slave (small) node. The seqdirectory, seq2sparse and split steps completed successfully, but the training phase failed with errors. I used the following command to train the model:
./elastic-mapreduce --jar s3n://<bucket name>/mahout/mahout-examples-0.7-job.jar \
--main-class org.apache.mahout.driver.MahoutDriver \
--logs \
--arg trainnb \
--arg -i --arg /<folder name>/mahout/review-train-vectors/ --arg -el \
--arg -o --arg /<folder name>/mahout/model/ \
--arg -li --arg /<folder name>/mahout/labelindex/ \
--arg -ow \
-j <job-name>
Here's the log of the job step:
java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
attempt_201302130846_0035_m_000000_0: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201302130846_0035_m_000000_0: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_0: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201302130846_0035/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_0: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
attempt_201302130846_0035_m_000000_1: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201302130846_0035_m_000000_1: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_1: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201302130846_0035/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_1: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
attempt_201302130846_0035_m_000000_2: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201302130846_0035_m_000000_2: SLF4J: Found binding in [jar:file:/home/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_2: SLF4J: Found binding in [jar:file:/mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201302130846_0035/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201302130846_0035_m_000000_2: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Has anyone tried this before? Please help me resolve this issue. I get the same error when I run the algorithm in Hadoop pseudo-distributed mode on my local system. It works only when the MAHOUT_LOCAL=True environment variable is set.