
I am using the Java ODF Toolkit library (simple-odf-0.6.6) to manipulate ODF documents. We process all documents in a loop:

    TextDocument textdoc = TextDocument.loadDocument(odtFileName);
    // ... change the content of the document ...
    textdoc.save(anotherOdtFileName);
    textdoc.close();
    // After this, all resources/streams are correctly closed; my colleagues have checked that many times :)

Because we iterate over thousands of documents, the Java application slowly consumes all available memory, and then everything slows down as the GC tries to free some of it. We do not get an OutOfMemoryError.

I tried tuning the JVM memory sizes and GC options (http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html). The application withstands a few more minutes, but then all memory is consumed again.

This is a sample of the GC log taken when the application has used up all available memory:

652.147: [Full GC 652.147: [Tenured: 454655K->454655K(454656K), 2.2387530 secs] 659263K->659216K(659264K), [Perm : 41836K->41836K(42112K)], 2.2388570 secs] [Times: user=2.25 sys=0.00, real=2.23 secs] 
654.387: [Full GC 654.387: [Tenured: 454656K->454656K(454656K), 2.2661510 secs] 659263K->659223K(659264K), [Perm : 41836K->41836K(42112K)], 2.2663190 secs] [Times: user=2.26 sys=0.00, real=2.26 secs] 
656.654: [Full GC 656.654: [Tenured: 454656K->454656K(454656K), 2.4117680 secs] 659263K->659229K(659264K), [Perm : 41836K->41836K(42112K)], 2.4118970 secs] [Times: user=2.41 sys=0.00, real=2.41 secs]

As you can see, only a few kB are released by each Full GC, and each one is very slow (over 2 seconds).
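
Those figures can be read straight off the log by subtracting the whole-heap "before" and "after" values in each line. A minimal sketch, assuming the log format shown above; the class name FullGcLine and its method are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FullGcLine {
    // Matches the whole-heap figure "before->after(capacity)" that follows
    // the first "secs] " in a Full GC line (the Tenured and Perm figures
    // are preceded by ": " and so are skipped).
    private static final Pattern HEAP =
            Pattern.compile("secs\\] (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    /** Returns the KB reclaimed by one Full GC, or -1 if the line does not match. */
    static long reclaimedKb(String line) {
        Matcher m = HEAP.matcher(line);
        if (!m.find()) {
            return -1;
        }
        return Long.parseLong(m.group(1)) - Long.parseLong(m.group(2));
    }
}
```

For the three lines above this gives 47, 40 and 34 KB reclaimed, out of a heap of 659264 KB.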

This jmap histogram shows the biggest consumers:

 num     #instances         #bytes  class name
----------------------------------------------
   1:       2535190       99077856  [C
   2:       2529791       60714984  java.lang.String
   3:         21085       27956544  [B
   4:        389820       16181680  [Ljava.lang.Object;
   5:        147111       13373896  [Ljava.util.HashMap$Entry;
   6:        108426       13180496  <constMethodKlass>
   7:        518834       12452016  java.util.HashMap$Entry
   8:        108426        9547136  <methodKlass>
   9:        321713        7721112  java.util.Vector
  10:        306308        7351392  org.apache.xerces.dom.AttributeMap
  11:        144353        6928944  java.util.HashMap
  12:         10230        5879960  <constantPoolKlass>
  13:         10230        5089344  <instanceKlassKlass>
  14:        114065        4562600  org.odftoolkit.odfdom.dom.attribute.text.TextStyleNameAttribute
  15:         58248        4193856  org.odftoolkit.odfdom.incubator.doc.text.OdfTextParagraph
  16:         90041        3601640  org.odftoolkit.odfdom.pkg.OdfAlienAttribute
  17:         36437        3459000  [I
  18:         48609        3110976  java.util.zip.ZipEntry
  19:          7454        2939616  <constantPoolCacheKlass>
  20:         36491        2627352  org.odftoolkit.odfdom.incubator.doc.style.OdfStyle
  21:         36399        2620728  org.odftoolkit.odfdom.incubator.doc.text.OdfTextSpan
  22:         58397        2335880  org.odftoolkit.odfdom.dom.attribute.style.StyleNameAttribute
  23:         65517        2096544  org.apache.xerces.dom.TextImpl
  24:         24270        1747440  org.odftoolkit.odfdom.incubator.doc.text.OdfTextListLevelStyleBullet
  25:         36511        1460440  org.odftoolkit.odfdom.dom.attribute.style.StyleFamilyAttribute
  26:         24335        1362760  org.odftoolkit.odfdom.dom.element.style.StyleParagraphPropertiesElement
  27:         24320        1361920  org.odftoolkit.odfdom.dom.element.style.StyleListLevelPropertiesElement
  28:         24320        1361920  org.odftoolkit.odfdom.dom.element.style.StyleListLevelLabelAlignmentElement
  29:         10933        1316952  java.lang.Class
  30:         29175        1167000  org.odftoolkit.odfdom.dom.attribute.style.StyleParentStyleNameAttribute
  31:         19464        1089984  org.odftoolkit.odfdom.dom.element.style.StyleFontFaceElement
  32:         68003        1088048  java.lang.Integer
  33:          3531        1082640  <methodDataKlass>
  34:         26757        1070280  org.odftoolkit.odfdom.dom.attribute.fo.FoMarginLeftAttribute
  35:         26752        1070080  org.odftoolkit.odfdom.dom.attribute.fo.FoTextIndentAttribute
  36:         24330         973200  org.odftoolkit.odfdom.dom.attribute.style.StyleWritingModeAttribute
  37:         24320         972800  org.odftoolkit.odfdom.dom.attribute.text.TextListLevelPositionAndSpaceModeAttribute
  38:         24320         972800  org.odftoolkit.odfdom.dom.attribute.text.TextLevelAttribute
  39:         24320         972800  org.odftoolkit.odfdom.dom.attribute.text.TextListTabStopPositionAttribute
  40:         24320         972800  org.odftoolkit.odfdom.dom.attribute.text.TextLabelFollowedByAttribute
  41:         24315         972600  org.odftoolkit.odfdom.dom.attribute.fo.FoLineHeightAttribute
  42:         24270         970800  org.odftoolkit.odfdom.dom.attribute.text.TextBulletCharAttribute
  43:         17064         955584  org.odftoolkit.odfdom.dom.element.style.StyleTextPropertiesElement
  44:         38372         920928  java.util.ArrayList
  45:         12135         873720  org.odftoolkit.odfdom.dom.element.text.TextAElement

As you can see, there are a lot of odftoolkit-related classes in memory.

Is there an effective way to deal with this problem? It would be great to be able to unload odftoolkit from our app at runtime and load it again to get rid of all its objects in memory (obviously they are all linked together, so the GC cannot do anything useful).

We are also considering running the critical code as a separate process for smaller groups of documents, but that does not address the cause of the problem.


3 Answers


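The chances are that you have a memory leak, either because of a problem in the library itself or because you are not using it correctly. We would need a properly constructed minimal reproducible example to know which.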

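The poor performance you are seeing is typical "GC death spiral" behaviour, where the application spends more and more time running the GC and reclaims less and less memory. After minutes or hours of GC thrashing, it will most likely end in an OOME (OutOfMemoryError).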

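The way to deal with a death spiral is to limit the time spent on garbage collection using the UseGCOverheadLimit JVM switch. If the GC takes more than a specified proportion of the time, the JVM proactively throws an OOME with the message "GC overhead limit exceeded". This is a good thing ... generally speaking.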
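
On HotSpot the switch is spelled -XX:+UseGCOverheadLimit. To see from inside the application how close it is to a death spiral, the standard java.lang.management beans report cumulative GC time. The following is only a sketch of such a measurement, not what the JVM does internally to enforce the limit; the class name GcOverhead and its methods are made up for illustration:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {
    /** Total milliseconds spent in all collectors since JVM start. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fraction of JVM uptime spent in GC so far (cumulative, not windowed). */
    static double fractionOfUptime() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) totalGcMillis() / uptimeMillis;
    }
}
```

Logging GcOverhead.fractionOfUptime() after every few documents would show the trend. In the log from the question, each Full GC takes over 2 seconds and the next one starts almost immediately, so nearly all wall-clock time is already going to GC.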

Then you try to track down the memory leak.


Tracking down Java memory leaks is covered by many resources, including StackOverflow Q&As on the topic.

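The basic idea is to use a tool (there are many of them) to identify the objects that are leaking, and then trace back through the chains of object references to find out why they are still reachable when they should not be.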
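
A low-tech way to do the first half of that is to take two "jmap -histo <pid>" snapshots some documents apart and see which classes keep gaining instances. The sketch below assumes the column layout of the histogram in the question; HistoDiff and its methods are illustrative names, and it says nothing about reference chains, for which a heap dump analyzer is still needed:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class HistoDiff {
    /** Parses "jmap -histo" text into a map of class name to instance count. */
    static Map<String, Long> parse(String histo) {
        Map<String, Long> counts = new HashMap<>();
        for (String line : histo.split("\\r?\\n")) {
            String[] f = line.trim().split("\\s+");
            // Data rows look like: "14:  114065  4562600  org...TextStyleNameAttribute"
            if (f.length >= 4 && f[0].matches("\\d+:")) {
                counts.put(f[3], Long.parseLong(f[1]));
            }
        }
        return counts;
    }

    /** Classes whose instance count grew between snapshots, largest growth first. */
    static List<Map.Entry<String, Long>> growth(Map<String, Long> before,
                                                Map<String, Long> after) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            if (delta > 0) {
                out.add(new AbstractMap.SimpleEntry<>(e.getKey(), delta));
            }
        }
        out.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return out;
    }
}
```

Classes that sit at the top of growth(...) across several pairs of snapshots are the ones worth following in a heap dump.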

Answered 2013-05-29T07:21:18.527

You could write some tests to verify a few things:

  1. Does the library itself leak in the simplest case? Use code such as:

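    import java.io.FileOutputStream;
    import java.io.OutputStream;
    import org.odftoolkit.simple.TextDocument;

    public void loadSaveDocument(String fileInName, String fileOutName) throws Exception {
        TextDocument textdoc = TextDocument.loadDocument(fileInName);
        // try-with-resources closes the output stream even if save() fails
        try (OutputStream fileOutStream = new FileOutputStream(fileOutName)) {
            textdoc.addParagraph("added text");
            textdoc.save(fileOutStream);
        } finally {
            textdoc.close();
        }
    }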
    
  2. If it does, a pragmatic solution is to work around it with a separate process, as suggested earlier.

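  3. If it does not, is there a specific file that causes the leak? Try running all of your documents through the code above.
  4. Or, if no specific document causes the leak, might a specific modification you make to the documents be causing it?
  5. If not, check what your application code does differently from the simple code.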
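
Each of these checks boils down to running some per-document task many times and watching whether the heap keeps growing. A minimal sketch of such a probe, assuming nothing about odftoolkit; HeapGrowthProbe and its methods are made-up names, and System.gc() is only a hint to the JVM, so the readings are approximate:

```java
import java.util.function.IntConsumer;

final class HeapGrowthProbe {
    /** Approximate used heap in bytes, after asking the JVM to collect. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // a request only; the JVM may ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Runs task(i) for i = 0..iterations-1 and records used heap after each run. */
    static long[] sample(int iterations, IntConsumer task) {
        long[] used = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            task.accept(i);
            used[i] = usedHeapBytes();
        }
        return used;
    }
}
```

The task would call loadSaveDocument for the i-th file (wrapping its checked exception). A series that keeps climbing over hundreds of iterations points to a leak in that code path; a flat series suggests the simple case is clean and the next item on the list is worth checking.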

I would also suggest using a profiler such as JProfiler and taking some snapshots. See the answers to "How to find memory leaks in Java using JProfiler?" for how to use it.

Answered 2013-05-29T09:37:39.450

You can look into the ODF library for any known issues with the version you happen to be using.

Also, one workaround is to fork all the document processing into another VM via a Runtime.exec call.
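
A minimal sketch of that workaround, using the standard ProcessBuilder API and processing the files in small batches so that each child JVM exits (and returns all of its memory) before any leak grows large. BatchForker is an illustrative name, and the worker main class passed in is hypothetical: it stands for whatever class of yours takes file names as arguments and processes them.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class BatchForker {
    /** Splits the file list into consecutive batches of at most batchSize entries. */
    static List<List<String>> batches(List<String> files, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < files.size(); i += batchSize) {
            out.add(new ArrayList<>(files.subList(i, Math.min(i + batchSize, files.size()))));
        }
        return out;
    }

    /** Builds the command line for a child JVM that processes one batch. */
    static List<String> command(String workerMainClass, List<String> batch) {
        List<String> cmd = new ArrayList<>();
        cmd.add(System.getProperty("java.home") + File.separator + "bin" + File.separator + "java");
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path")); // reuse the parent's classpath
        cmd.add(workerMainClass); // hypothetical class whose main() handles the given files
        cmd.addAll(batch);
        return cmd;
    }

    /** Runs one batch in a fresh JVM, sharing stdout/stderr, and returns its exit code. */
    static int run(String workerMainClass, List<String> batch)
            throws IOException, InterruptedException {
        return new ProcessBuilder(command(workerMainClass, batch)).inheritIO().start().waitFor();
    }
}
```

The parent would loop over batches(allFiles, n) and call run(...) for each, stopping or retrying when the exit code is non-zero. The batch size is a trade-off between JVM start-up cost and how much leaked memory each child accumulates.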

Answered 2013-05-29T08:49:09.373