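I am running the Mahout Wikipedia example on a 7 GB Wikipedia dump, but I get an OutOfMemoryError when testing the classifier.

I have pasted the output below. I set both the Mahout heap size and the Java heap size to 2500m.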

$MAHOUT_HOME/bin/mahout testclassifier -m wikipediamodel -d wikipediainput

run with heapsize 2500
-Xmx2500m
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/home/hduser/hadoop/hadoop
No HADOOP_CONF_DIR set, using /home/hduser/hadoop/hadoop/conf
MAHOUT-JOB: /home/nauman/mahout/examples/target/mahout-examples-0.7-SNAPSHOT-job.jar
12/04/10 00:06:18 INFO common.HadoopUtil: Deleting wikipediainput-output
12/04/10 00:06:18 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/04/10 00:06:18 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.

12/04/10 00:06:18 INFO mapred.FileInputFormat: Total input paths to process : 1
12/04/10 00:06:18 INFO mapred.JobClient: Running job: job_local_0001
12/04/10 00:06:18 INFO mapred.FileInputFormat: Total input paths to process : 1
12/04/10 00:06:18 INFO mapred.MapTask: numReduceTasks: 1
12/04/10 00:06:18 INFO mapred.MapTask: io.sort.mb = 100
12/04/10 00:06:19 INFO mapred.MapTask: data buffer = 79691776/99614720
12/04/10 00:06:19 INFO mapred.MapTask: record buffer = 262144/327680
12/04/10 00:06:19 INFO bayes.BayesClassifierMapper: Bayes Parameter {basePath=wikipediamodel, classifierType=bayes, dataSource=hdfs, alpha_i=1.0, gramSize=1, verbose=false, encoding=UTF-8, confusionMatrix=null, defaultCat=unknown, testDirPath=wikipediainput}
12/04/10 00:06:19 INFO bayes.BayesClassifierMapper: {basePath=wikipediamodel, classifierType=bayes, dataSource=hdfs, alpha_i=1.0, gramSize=1, verbose=false, encoding=UTF-8, confusionMatrix=null, defaultCat=unknown, testDirPath=wikipediainput}
12/04/10 00:06:19 INFO bayes.BayesClassifierMapper: Testing Bayes Classifier
12/04/10 00:06:19 INFO mapred.JobClient:  map 0% reduce 0%
12/04/10 00:06:20 INFO bayes.SequenceFileModelReader: Read 50000 feature weights
12/04/10 00:06:20 INFO bayes.SequenceFileModelReader: Read 100000 feature weights
12/04/10 00:06:21 INFO bayes.SequenceFileModelReader: Read 150000 feature weights
12/04/10 00:06:21 INFO bayes.SequenceFileModelReader: Read 200000 feature weights
12/04/10 00:06:21 INFO bayes.SequenceFileModelReader: Read 250000 feature weights
12/04/10 00:06:21 INFO mapred.LocalJobRunner: file:/home/nauman/wikipediainput/part-r-00000:0+33554432
12/04/10 00:06:22 INFO bayes.SequenceFileModelReader: Read 300000 feature weights
12/04/10 00:06:22 INFO bayes.SequenceFileModelReader: Read 350000 feature weights
12/04/10 00:06:22 INFO bayes.SequenceFileModelReader: Read 400000 feature weights
12/04/10 00:06:22 INFO bayes.SequenceFileModelReader: Read 450000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 500000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 550000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 600000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 650000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 700000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 750000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 800000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 850000 feature weights
12/04/10 00:06:24 INFO bayes.SequenceFileModelReader: Read 900000 feature weights
12/04/10 00:06:25 INFO bayes.SequenceFileModelReader: Read 950000 feature weights
12/04/10 00:06:25 INFO bayes.SequenceFileModelReader: Read 1000000 feature weights
12/04/10 00:06:25 INFO bayes.SequenceFileModelReader: Read 1050000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1100000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1150000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1200000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1250000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1300000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1350000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1400000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1450000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1500000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1550000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1600000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1650000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1700000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1750000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1800000 feature weights
12/04/10 00:06:28 INFO bayes.SequenceFileModelReader: Read 1850000 feature weights
12/04/10 00:06:28 INFO bayes.SequenceFileModelReader: Read 1900000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 1950000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2000000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2050000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2100000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2150000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2200000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2250000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2300000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2350000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2400000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2450000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2500000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2550000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2600000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2650000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2700000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 2750000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 2800000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 2850000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 2900000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 2950000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 3000000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 3050000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3100000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3150000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3200000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3250000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3300000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3350000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3400000 feature weights
12/04/10 00:06:34 INFO bayes.SequenceFileModelReader: Read 3450000 feature weights
12/04/10 00:06:34 INFO bayes.SequenceFileModelReader: Read 3500000 feature weights
12/04/10 00:06:34 INFO bayes.SequenceFileModelReader: Read 3550000 feature weights
12/04/10 00:06:34 INFO bayes.SequenceFileModelReader: Read 3600000 feature weights
12/04/10 00:06:39 INFO bayes.SequenceFileModelReader: Read 3650000 feature weights
12/04/10 00:06:39 INFO bayes.SequenceFileModelReader: Read 3700000 feature weights
12/04/10 00:06:40 INFO bayes.SequenceFileModelReader: Read 3750000 feature weights
12/04/10 00:06:40 INFO bayes.SequenceFileModelReader: Read 3800000 feature weights
12/04/10 00:06:42 INFO bayes.SequenceFileModelReader: Read 3850000 feature weights
12/04/10 00:06:42 INFO bayes.SequenceFileModelReader: Read 3900000 feature weights
12/04/10 00:06:42 INFO bayes.SequenceFileModelReader: Read 3950000 feature weights
12/04/10 00:06:42 INFO bayes.SequenceFileModelReader: Read 4000000 feature weights
12/04/10 00:06:42 INFO bayes.SequenceFileModelReader: Read 4050000 feature weights
12/04/10 00:06:44 INFO bayes.SequenceFileModelReader: Read 4100000 feature weights
12/04/10 00:06:44 INFO bayes.SequenceFileModelReader: Read 4150000 feature weights
12/04/10 00:06:45 INFO bayes.SequenceFileModelReader: Read 4200000 feature weights
12/04/10 00:06:45 INFO bayes.SequenceFileModelReader: Read 4250000 feature weights
12/04/10 00:06:45 INFO bayes.SequenceFileModelReader: Read 4300000 feature weights
12/04/10 00:06:47 INFO bayes.SequenceFileModelReader: Read 4350000 feature weights
12/04/10 00:06:47 INFO bayes.SequenceFileModelReader: Read 4400000 feature weights
12/04/10 00:06:47 INFO bayes.SequenceFileModelReader: Read 4450000 feature weights
12/04/10 00:06:47 INFO bayes.SequenceFileModelReader: Read 4500000 feature weights
12/04/10 00:06:50 INFO bayes.SequenceFileModelReader: Read 4550000 feature weights
12/04/10 00:06:50 INFO bayes.SequenceFileModelReader: Read 4600000 feature weights
12/04/10 00:06:51 INFO bayes.SequenceFileModelReader: Read 4650000 feature weights
12/04/10 00:06:51 INFO bayes.SequenceFileModelReader: Read 4700000 feature weights
12/04/10 00:06:53 INFO bayes.SequenceFileModelReader: Read 4750000 feature weights
12/04/10 00:06:53 INFO bayes.SequenceFileModelReader: Read 4800000 feature weights
12/04/10 00:06:53 INFO bayes.SequenceFileModelReader: Read 4850000 feature weights
12/04/10 00:06:56 INFO bayes.SequenceFileModelReader: Read 4900000 feature weights
12/04/10 00:06:56 INFO bayes.SequenceFileModelReader: Read 4950000 feature weights
12/04/10 00:06:56 INFO bayes.SequenceFileModelReader: Read 5000000 feature weights
12/04/10 00:06:59 INFO bayes.SequenceFileModelReader: Read 5050000 feature weights
12/04/10 00:06:59 INFO bayes.SequenceFileModelReader: Read 5100000 feature weights
12/04/10 00:06:59 INFO bayes.SequenceFileModelReader: Read 5150000 feature weights
12/04/10 00:07:01 INFO bayes.SequenceFileModelReader: Read 5200000 feature weights
12/04/10 00:07:02 INFO bayes.SequenceFileModelReader: Read 5250000 feature weights
12/04/10 00:07:02 INFO bayes.SequenceFileModelReader: Read 5300000 feature weights
12/04/10 00:07:04 INFO bayes.SequenceFileModelReader: Read 5350000 feature weights
12/04/10 00:07:04 INFO bayes.SequenceFileModelReader: Read 5400000 feature weights
12/04/10 00:07:07 INFO bayes.SequenceFileModelReader: Read 5450000 feature weights
12/04/10 00:07:07 INFO bayes.SequenceFileModelReader: Read 5500000 feature weights
12/04/10 00:07:10 INFO bayes.SequenceFileModelReader: Read 5550000 feature weights
12/04/10 00:07:12 INFO bayes.SequenceFileModelReader: Read 5600000 feature weights
12/04/10 00:07:12 INFO bayes.SequenceFileModelReader: Read 5650000 feature weights
12/04/10 00:07:15 INFO bayes.SequenceFileModelReader: Read 5700000 feature weights
12/04/10 00:07:17 INFO bayes.SequenceFileModelReader: Read 5750000 feature weights
12/04/10 00:07:20 INFO bayes.SequenceFileModelReader: Read 5800000 feature weights
12/04/10 00:07:23 INFO bayes.SequenceFileModelReader: Read 5850000 feature weights
12/04/10 00:07:25 INFO bayes.SequenceFileModelReader: Read 5900000 feature weights
12/04/10 00:07:28 INFO bayes.SequenceFileModelReader: Read 5950000 feature weights
12/04/10 00:07:33 INFO bayes.SequenceFileModelReader: Read 6000000 feature weights
12/04/10 00:07:38 INFO bayes.SequenceFileModelReader: Read 6050000 feature weights
12/04/10 00:07:46 INFO bayes.SequenceFileModelReader: Read 6100000 feature weights
12/04/10 00:08:04 INFO bayes.SequenceFileModelReader: Read 6150000 feature weights
12/04/10 00:08:20 INFO bayes.SequenceFileModelReader: Read 6200000 feature weights
12/04/10 00:08:47 INFO bayes.SequenceFileModelReader: Read 6250000 feature weights
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at sun.misc.FloatingDecimal.toJavaFormatString(FloatingDecimal.java:887)
    at java.lang.Double.toString(Double.java:179)
    at java.text.DigitList.set(DigitList.java:272)
    at java.text.DecimalFormat.format(DecimalFormat.java:584)
    at java.text.DecimalFormat.format(DecimalFormat.java:507)
    at java.text.NumberFormat.format(NumberFormat.java:269)
    at org.apache.hadoop.util.StringUtils.formatPercent(StringUtils.java:119)
    at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1283)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1251)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierDriver.runJob(BayesClassifierDriver.java:87)
    at org.apache.mahout.classifier.bayes.TestClassifier.classifyParallel(TestClassifier.java:288)
    at org.apache.mahout.classifier.bayes.TestClassifier.main(TestClassifier.java:191)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
12/04/10 00:17:15 WARN mapred.LocalJobRunner: job_local_0001
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:39)
    at java.nio.CharBuffer.allocate(CharBuffer.java:312)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:760)
    at org.apache.hadoop.io.Text.decode(Text.java:350)
    at org.apache.hadoop.io.Text.decode(Text.java:327)
    at org.apache.hadoop.io.Text.toString(Text.java:254)
    at org.apache.mahout.common.StringTuple.readFields(StringTuple.java:143)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1836)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1876)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:525)
    at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadFeatureWeights(SequenceFileModelReader.java:72)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:46)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)

3 Answers

1

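You need to increase the memory available to the mappers. Set mapred.map.child.java.opts to a value large enough to hold the model.

It may also be that you are trying to load an unrealistically large amount of data into memory.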
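As a rough sketch of the setting this answer describes, the snippet below prints a property element that could be pasted into mapred-site.xml. The function name child_opts_property is hypothetical, and the property name mapred.child.java.opts is my assumption for the general per-task JVM option in Hadoop 1.x; a map-specific variant may be preferable where the release supports it. Note also that the log in the question shows LocalJobRunner (job_local_0001). As far as I know, map tasks in that mode run inside the client JVM, so a child-JVM option may have no effect there and the client heap would be the one to raise.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Mahout): prints a <property>
# element for mapred-site.xml with the requested task heap in megabytes.
# Assumption: mapred.child.java.opts is the per-task JVM option to set.
child_opts_property() {
  heap_mb="$1"
  cat <<EOF
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx${heap_mb}m</value>
</property>
EOF
}

# Print a fragment matching the 2500m heap mentioned in the question.
child_opts_property 2500
```

The printed fragment goes inside the configuration element of mapred-site.xml. It only matters when tasks run in separate child JVMs, that is, on a real or pseudo-distributed cluster.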

Answered 2012-07-11T09:41:04.720
0

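I ran into the same problem. After some experimenting, I tried increasing the JVM memory by setting the Mahout opts. Try this:

Something like "export MAVEN_OPTS=-Xmx1g" will give the JVM more memory.

Please post your results, since I think many people are facing this problem.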
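A minimal sketch of how the heap-related environment variables might be set together before re-running the test. Only MAVEN_OPTS comes from this answer. MAHOUT_HEAPSIZE is inferred from the "run with heapsize 2500" line in the question's log, HADOOP_HEAPSIZE is my assumption for the Hadoop 1.x launcher, and the 4096 value is purely hypothetical, so check the bin/mahout and bin/hadoop scripts of your installation before relying on any of them.

```shell
#!/bin/sh
# Sketch only: apart from MAVEN_OPTS, the variable names below are assumptions.

HEAP_MB=4096   # hypothetical value; choose one that fits the machine's RAM

# From the answer: affects JVMs launched through mvn.
export MAVEN_OPTS="-Xmx${HEAP_MB}m"
# Assumed: read by bin/mahout (value in MB), as the log line suggests.
export MAHOUT_HEAPSIZE="$HEAP_MB"
# Assumed: read by the Hadoop 1.x bin/hadoop launcher (value in MB).
export HADOOP_HEAPSIZE="$HEAP_MB"

echo "MAVEN_OPTS=$MAVEN_OPTS MAHOUT_HEAPSIZE=$MAHOUT_HEAPSIZE HADOOP_HEAPSIZE=$HADOOP_HEAPSIZE"

# Re-run the classifier test only if the Mahout launcher is actually present.
if [ -n "$MAHOUT_HOME" ] && [ -x "$MAHOUT_HOME/bin/mahout" ]; then
  "$MAHOUT_HOME/bin/mahout" testclassifier -m wikipediamodel -d wikipediainput
fi
```

Since the question already used a 2500m heap, a 1g setting would be smaller than what failed. That is why the sketch takes the heap size as a single variable that can be raised.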

Answered 2012-04-10T02:20:54.840
-1

If you are doing this on a single machine, reducing the Hadoop record size may also help, since it increases the number of map tasks.

Answered 2012-07-11T08:20:36.900