java - Apparent memory leak in Hadoop

Question

I have an apparent memory leak in a Hadoop program I am running. Specifically, I get the message: ERROR GC overhead limit exceeded, followed by the exception

attempt_201210041336_0765_m_0000000_1: Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError: GC overhead limit exceeded
attempt_201210041336_0765_m_0000000_1: at java.util.Vector.elements (Vector.java:292)
attempt_201210041336_0765_m_0000000_1: at org.apache.log4j.helpers.AppenderAttachableImpl.getAllAppenders(AppenderAttachableImpl.java:84)
attempt_201210041336_0765_m_0000000_1: at org.apache.log4j.Category.getAllAppenders (Category.java:415)
attempt_201210041336_0765_m_0000000_1: at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:256)
attempt_201210041336_0765_m_0000000_1: at org.apache.hadoop.mapred.Child$3.run(Child.java:157)

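The data set I am using in my initial trials should be very small, so I should not be hitting any memory limit. More importantly, I do not want to change the Hadoop configuration; if the program cannot run with the current configuration, the program needs to be rewritten.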

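Can anyone help me figure out how to diagnose this problem? Is there a command-line argument to get a stack trace of memory usage? Is there any other way of tracking this issue down?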

P.S. I typed the error message by hand because I cannot copy and paste from the affected system. Please ignore any typos; they are my own mistakes.

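Edit: An update. I ran the job a few more times; although I always get the ERROR GC overhead limit exceeded message, I do not always get the log4j stack trace. So the problem is probably not log4j itself. Rather, log4j just happens to be where the failure surfaces when memory runs out because of... something else?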

score 0 · Accepted Answer

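"GC overhead limit exceeded" probably means that a lot of short-lived objects are being created, more than the GC can handle without consuming more than 98% of the total time. See this question on how to find the offending classes and allocation spots with JProfiler.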
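If a profiler is not available on the cluster, a rough first check is possible from inside the task itself using the JDK's standard `java.lang.management` API. The following is only a minimal sketch: the class name `GcStats` and its methods are hypothetical helpers, not part of Hadoop, and the assumption is that you can call `snapshot()` periodically from your own map code and write the result to the task log.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not a Hadoop class): reports heap use and cumulative GC time.
public class GcStats {

    // Total milliseconds spent in all collectors since JVM start (approximate).
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Approximate fraction of JVM uptime spent in GC so far, clamped to [0, 1].
    static double gcTimeFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptime <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) totalGcMillis() / uptime);
    }

    // One-line summary suitable for a log message.
    static String snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return String.format(
            "heap used=%d bytes, max=%d bytes, gc time=%d ms (%.1f%% of uptime)",
            heap.getUsed(),
            heap.getMax(), // -1 if the maximum is undefined
            totalGcMillis(),
            100.0 * gcTimeFraction());
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

If the GC share of uptime climbs steadily while used heap stays close to the maximum, that is consistent with the situation described above. This does not identify the offending classes; for that, a heap dump or a profiler is still needed. JVM flags such as `-verbose:gc` or `-XX:+HeapDumpOnOutOfMemoryError` give similar information without code changes, but they require changing the task JVM options.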

Disclaimer: My company develops JProfiler.
