0

我在我的集​​群上运行 Mahout 0.7,该集群有 30 个节点(每个节点有 8 个核心 16G 内存),试图对 250000 SparseVector(300000) 进行集群。

如果我通过调整树冠参数(T1,T2)来聚类发现少量树冠中心,它会很好地工作。

超过一定数量的树冠中心,作业在减少阶段的 67% 时不断失败,并显示“错误:Java 堆空间”消息。

如果 K 的值增加,K-means 聚类也有相同的堆空间问题。

我听说树冠中心向量和 k 中心向量保存在每个映射器和减速器的内存中。那将是num of canopy center(or k) x sparsevector(300000 size) = 足以容纳 4g 内存的东西,这似乎还不错。

根据这里和其他地方以前的问题,我已经调高了我能找到的每个内存旋钮:

  • hadoop-env.sh:在namenode上将所有堆空间设置为16GB,在datanodes上设置为8GB。
  • mapred-site.xml:添加 mapred.{map, reduce}.child.java.opts 属性,并将其值设置为 -Xmx4000m
  • mapred-site.xml:更改 mapred.tasktracker.{map, reduce}.tasks.maximum 属性,并将它们的值从 8 降低到 4

而且问题一直存在。我一直在反对这个问题太久了——有人有什么建议吗?

完整的命令和输出如下所示:

    public static void main(String [] args) throws Exception{

    String ratingsPath = args[0];
    String outputPath = args[1];
    String T1 = args[2];
    String T2 = args[3];

    Configuration conf = new Configuration();       

    HadoopUtil.delete(conf, new Path(outputPath));

    CanopyDriver.run(conf, new Path(ratingsPath), new Path(outputPath), new ManhattanDistanceMeasure(), 
            Double.parseDouble(T1), Double.parseDouble(T2), true, 0.0, false);

}

这是我面临的错误消息:

Exception in thread "main" java.lang.InterruptedException: Canopy Job failed processing /MrBic/Output/SeedGeneration_predSample
at org.apache.mahout.clustering.canopy.CanopyDriver.buildClustersMR(CanopyDriver.java:363)
at org.apache.mahout.clustering.canopy.CanopyDriver.buildClusters(CanopyDriver.java:248)
at org.apache.mahout.clustering.canopy.CanopyDriver.run(CanopyDriver.java:155)
at org.apache.mahout.clustering.canopy.CanopyDriver.run(CanopyDriver.java:170)
at MrBicClusteringDriver.main(MrBicClusteringDriver.java:32)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

2013-06-12 10:56:00,825 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:139)
at org.apache.mahout.math.AbstractVector.assign(AbstractVector.java:560)
at org.apache.mahout.clustering.AbstractCluster.observe(AbstractCluster.java:275)
at org.apache.mahout.clustering.canopy.Canopy.<init>(Canopy.java:43)
at org.apache.mahout.clustering.canopy.CanopyClusterer.addPointToCanopies(CanopyClusterer.java:163)
at org.apache.mahout.clustering.canopy.CanopyReducer.reduce(CanopyReducer.java:47)
at org.apache.mahout.clustering.canopy.CanopyReducer.reduce(CanopyReducer.java:30)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
4

0 回答 0