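I am implementing a BFS algorithm with Giraph on YARN, and I created a custom value type for the data stored on my vertices (the vertex data). Since I made that change, reading the edges fails.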

I traced the error to the following lines of code:

  • In ByteArrayEdges, the variable serializedEdgesBytesUsed gets the value 1987015248, and allocating the new array then throws an OutOfMemory error (as far as I know, Java's limit is 64K)

    @Override
    public void readFields(DataInput in) throws IOException {
      serializedEdgesBytesUsed = in.readInt();
      if (serializedEdgesBytesUsed > 0) {
        // Only create a new buffer if the old one isn't big enough
        if (serializedEdges == null ||
            serializedEdgesBytesUsed > serializedEdges.length) {
          serializedEdges = new byte[serializedEdgesBytesUsed];
        }
        in.readFully(serializedEdges, 0, serializedEdgesBytesUsed);
      }
      edgeCount = in.readInt();
    }

I am not sure why this happens, but the problem did not exist before I started using the custom vertex data.
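
One way to look at the suspicious length is to decode its four bytes. The sketch below uses only the JDK; the class name LengthPrefixCheck and both helper methods are made up for illustration and are not part of Giraph. It shows that 1987015248 is 0x766F7250, which reads as the ASCII text "vorP", so readInt() may be consuming bytes that were written as text. The second helper illustrates one possible cause, under the assumption that the vertex value is deserialized just before the edges: a value written one way and read back another way shifts the stream, and the next readInt() lands in the middle of the value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LengthPrefixCheck {

    // Renders the four big-endian bytes of an int (the byte order readInt() uses) as ASCII.
    static String asAscii(int value) {
        byte[] bytes = ByteBuffer.allocate(4).putInt(value).array();
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Hypothetical mismatch: the vertex value is written as a UTF string
    // but read back as a double, so the reader consumes the wrong number of bytes.
    static int misreadEdgeLength(String vertexValue, int edgeBytes) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeUTF(vertexValue); // 2-byte length followed by the text
        out.writeInt(edgeBytes);   // the length prefix that readFields expects next

        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
        in.readDouble();           // mismatched read: always consumes exactly 8 bytes
        return in.readInt();       // now decodes 4 bytes from inside the text
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Integer.toHexString(1987015248)); // prints 766f7250
        System.out.println(asAscii(1987015248));             // prints vorP
        // The writer stored 12, but the shifted reader sees a different, much larger number.
        System.out.println(misreadEdgeLength("some vertex label", 12));
    }
}
```

If the custom value class has write and readFields methods, checking that they write and read exactly the same fields in the same order would confirm or rule out this guess.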

The full log is below (I am testing directly from Eclipse, because doing so on the pseudo-distributed cluster is much harder):

2015-08-20 01:52:21,103 INFO  [LocalJobRunner Map Task Executor #0] utils.ProgressableUtils (ProgressableUtils.java:waitFor(315)) - waitFor: Future result not ready yet java.util.concurrent.FutureTask@b2dd686
2015-08-20 01:52:21,103 INFO  [LocalJobRunner Map Task Executor #0] utils.ProgressableUtils (ProgressableUtils.java:waitFor(197)) - waitFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
2015-08-20 01:53:12,527 ERROR [LocalJobRunner Map Task Executor #0] graph.GraphMapper (GraphMapper.java:run(101)) - Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
    at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
    at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
    at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
    at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
    at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:202)
    at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
    ... 17 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.giraph.edge.ByteArrayEdges.readFields(ByteArrayEdges.java:193)
    at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:541)
    at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
    at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
    at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
    ... 4 more
2015-08-20 01:53:12,532 ERROR [LocalJobRunner Map Task Executor #0] worker.BspServiceWorker (BspServiceWorker.java:unregisterHealth(777)) - unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_local1113753160_0001/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/localhost_0 on superstep -1
2015-08-20 01:53:12,558 INFO  [Thread-13] mapred.LocalJobRunner (LocalJobRunner.java:runTasks(456)) - map task executor complete.
2015-08-20 01:53:12,562 WARN  [Thread-13] mapred.LocalJobRunner (LocalJobRunner.java:run(560)) - job_local1113753160_0001
java.lang.Exception: java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:104)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6e5efd25
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
    at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
    at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
    at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
    at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
    at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    ... 8 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:202)
    at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
    ... 17 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.giraph.edge.ByteArrayEdges.readFields(ByteArrayEdges.java:193)
    at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:541)
    at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
    at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
    at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:115)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
    ... 4 more

The command line used to run the job is:

$HADOOP_HOME/bin/yarn jar $GIRAPH_HOME/gaph-examples/target/giraph-examples-1.1.0-for-hadoop-2.4.0-jar-with-dependencies.jar algoritmos.masivos.BusquedaDeCaminosNavegacionalesWikiquotesMasivo lectura_de_grafo.BusquedaDeCaminosNavegacionalesWikiquote -vif pruebas.IdTextWithValueDoubleInputFormat -vip /user/hduser/input/wiki-graph-chiquito.txt -vof pruebas.IdTextWithValueTextOutputFormat -op /user/hduser/output/caminosNavegacionales -w 2 -yh 250

Maybe I should use an EdgeInputFormat?

Thanks for reading.

1 Answer

I think the actual problem is that the map task container has too little memory, which leads to the Java heap space error.

As a quick fix, increase the memory available to the YARN map/reduce containers in the configuration.

Raise the values of the following properties in mapred-site.xml:

mapreduce.map.memory.mb
mapreduce.reduce.memory.mb

mapreduce.map.java.opts
mapreduce.reduce.java.opts

[Note: each *.memory.mb value (the container size) should be larger than the -Xmx heap size set in the corresponding *.java.opts property]
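
As a rough illustration of how the four properties relate, the sketch below derives the heap size from the container size. The class name ContainerMemorySketch, the 80% heap fraction, and the 2048/4096 MB container sizes are all assumptions chosen for the example, not recommended values and not anything taken from the question; the point is only that each -Xmx stays below its container limit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ContainerMemorySketch {

    // Assumed rule of thumb: give the JVM heap about 80% of the container,
    // leaving the rest for non-heap memory. Adjust to your workload.
    static int heapMb(int containerMb) {
        return containerMb * 8 / 10;
    }

    // Builds the four property values for the given container sizes (in MB).
    static Map<String, String> settings(int mapContainerMb, int reduceContainerMb) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapreduce.map.memory.mb", Integer.toString(mapContainerMb));
        props.put("mapreduce.map.java.opts", "-Xmx" + heapMb(mapContainerMb) + "m");
        props.put("mapreduce.reduce.memory.mb", Integer.toString(reduceContainerMb));
        props.put("mapreduce.reduce.java.opts", "-Xmx" + heapMb(reduceContainerMb) + "m");
        return props;
    }

    public static void main(String[] args) {
        // Illustrative sizes only: 2 GB map containers, 4 GB reduce containers.
        settings(2048, 4096).forEach((name, value) ->
            System.out.println(name + " = " + value));
    }
}
```

With these example inputs the map heap comes out as -Xmx1638m inside a 2048 MB container, and the reduce heap as -Xmx3276m inside a 4096 MB container.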

answered 2015-08-25T14:11:03.463