java - Memory required by JVM for creating CSV files and zip it on the fly

Question

I am creating two CSV files using StringBuffers and byte arrays.
I use ZipOutputStream to generate the zip file. Each CSV file will have 20K records with 14 columns. The records are fetched from the DB and stored in an ArrayList. I have to iterate over the list, build a StringBuffer, and convert the StringBuffer to a byte array to write it to the zip entry.

I want to know how much memory the JVM needs for the entire process, starting from storing the records in the ArrayList.
I have provided the code snippet below.

// Build one CSV payload in memory (a sample row repeated 20K times)
StringBuffer responseBuffer = new StringBuffer();
String response = "Hello, sdksad, sfksdfjk, World, Date, ask, askdl, sdkldfkl, skldkl, sdfklklgf, sdlksldklk, dfkjsk, dsfjksj, dsjfkj, sdfjkdsfj\n";
for (int i = 0; i < 20000; i++) {
    responseBuffer.append(response);
}
response = responseBuffer.toString();
byte[] responseArray = response.getBytes();

// Write the same payload as two entries of a zip streamed to the servlet response
res.setContentType("application/zip");
ZipOutputStream zout = new ZipOutputStream(res.getOutputStream());
ZipEntry parentEntry = new ZipEntry("parent.csv");
zout.putNextEntry(parentEntry);
zout.write(responseArray);
zout.closeEntry();
ZipEntry childEntry = new ZipEntry("child.csv");
zout.putNextEntry(childEntry);
zout.write(responseArray);
zout.closeEntry();
zout.close();

Please help me with this. Thanks in advance.

score 3 · Accepted Answer

To analyze memory usage, you can use a profiler.

JProfiler and YourKit are both very good at this.

VisualVM is also decent, up to a point.

score 3 · Accepted Answer

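I guess you have already tried counting the bytes that will be allocated for the StringBuffer and the byte array. The problem is that you can't really know how much memory your application will use unless you have an upper bound on the size of the CSV records. If you want your software to be stable, robust, and scalable, I'm afraid you are asking the wrong question: you should aim to do the job with a fixed amount of memory, which in your case looks easy.

The key point is that your processing is entirely FIFO: you read records from the database and then write them, in the same order, to a FIFO stream (an OutputStream in this case). Even zip compression is stream-based and uses a fixed amount of memory internally, so you are completely safe there.

Instead of buffering the whole input in one huge String, converting it to one huge byte array, and then writing that to the output stream, you should read each response element from the database separately (or in fixed-size chunks, say 100 records at a time) and write it to the output stream. Something like: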

res.setContentType("application/zip");
ZipOutputStream zout = new ZipOutputStream(res.getOutputStream());
ZipEntry parentEntry = new ZipEntry("parent.csv");
zout.putNextEntry(parentEntry);
while (... fetch entries ...)
    zout.write(...data...)
zout.closeEntry();

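The advantage of this approach is that, because it works on small chunks, you can easily estimate their size and give your JVM enough memory that it never crashes. And you know it will still work if your CSV files grow beyond 20K rows in the future.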
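
To make the pseudocode above more concrete, here is a minimal sketch of the streaming idea. The class name `StreamingCsvZip`, its method names, and the use of an `Iterator<String[]>` as the row source are assumptions made for illustration; in a real application the iterator would wrap whatever fetches rows from the database. The sketch also assumes fields contain no commas, quotes, or newlines, so it does no CSV escaping.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of the original question or answer.
final class StreamingCsvZip {

    /** Writes one CSV entry row by row; only the current row is held as a String/byte[]. */
    static void writeCsvEntry(ZipOutputStream zout, String entryName,
                              Iterator<String[]> rows) throws IOException {
        zout.putNextEntry(new ZipEntry(entryName));
        while (rows.hasNext()) {
            // Assumption: no escaping needed (fields have no commas or newlines).
            String line = String.join(",", rows.next()) + "\n";
            zout.write(line.getBytes(StandardCharsets.UTF_8));
        }
        zout.closeEntry();
    }

    /** Streams both CSV files into one zip written to the given output stream. */
    static void writeZip(OutputStream out, Iterator<String[]> parentRows,
                         Iterator<String[]> childRows) throws IOException {
        // try-with-resources finishes the zip and closes the underlying stream.
        try (ZipOutputStream zout = new ZipOutputStream(out)) {
            writeCsvEntry(zout, "parent.csv", parentRows);
            writeCsvEntry(zout, "child.csv", childRows);
        }
    }
}
```

In the servlet from the question, this would be called roughly as `StreamingCsvZip.writeZip(res.getOutputStream(), parentRows, childRows)` after setting the content type. Memory use on the Java side then depends on the size of one row rather than on the number of rows, provided the iterators themselves do not load the whole result set up front.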

score 0 · Accepted Answer

You can measure the memory with MemoryTestbench.

http://www.javaspecialists.eu/archive/Issue029.html

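That article describes how to do it. It is simple, accurate to 1 byte, and I use it often.
It can even be run from a JUnit test case, which makes it very useful, whereas a profiler cannot be run from a JUnit test case.

With this approach you can even measure the memory size of a single Integer object.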

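There is one special thing about zip, though. The zip stream uses a native C library, so MemoryTestbench may not be able to measure that memory, only the Java part.
You should try both variants: MemoryTestbench and a profiler (jprof).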
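
For readers who only want a rough number, the general idea of comparing heap usage before and after building an object can be sketched with `Runtime`. This is not the MemoryTestbench code from the linked article; the class `HeapDelta` and its methods are hypothetical names, and the result is only approximate because `System.gc()` is a request the JVM may ignore. It reports what is still retained after the build, not the peak during it, and it sees nothing allocated by native code.

```java
import java.util.function.Supplier;

// Hypothetical helper; a rough approximation, not the MemoryTestbench from the article.
final class HeapDelta {

    /** Approximate heap bytes in use after asking the JVM to collect garbage. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        // System.gc() is only a hint, so repeat it a few times and treat results as estimates.
        for (int i = 0; i < 3; i++) {
            System.gc();
        }
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Keeps the built object reachable together with the measured heap difference. */
    static final class Result<T> {
        final T value;
        final long approxBytes;

        Result(T value, long approxBytes) {
            this.value = value;
            this.approxBytes = approxBytes;
        }
    }

    /** Builds an object and reports the approximate heap growth it caused. */
    static <T> Result<T> measure(Supplier<T> factory) {
        long before = usedHeapBytes();
        T value = factory.get();
        long after = usedHeapBytes();
        return new Result<>(value, after - before);
    }

    public static void main(String[] args) {
        String line = "Hello, World, Date\n"; // shortened stand-in for one CSV row
        Result<byte[]> r = measure(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 20000; i++) {
                sb.append(line);
            }
            return sb.toString().getBytes();
        });
        System.out.println("CSV bytes: " + r.value.length
                + ", approx. retained heap: " + r.approxBytes);
    }
}
```

The printed heap figure varies between runs and JVMs, so it is best read as an order of magnitude and cross-checked with a profiler.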
