java - Random access to a zipped file without using ZipFile (because ZipFile has a major bug)

Question

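I know, I know, who would want to compress or uncompress large files in Java. Completely unreasonable. Suspend your disbelief for a moment and assume I have a good reason for uncompressing a large zip file.

Problem 1: ZipFile has a bug (bug # 6280693), which Sun fixed in Java 1.6 (Mustang). The fix does not help us because our software must support Java 1.4. As I understand it, the bug works like this: when the following code runs, Java allocates a chunk of memory large enough to hold the entire file.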

ZipFile zipFile = new ZipFile("/tmp/myFile.zip");

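If /tmp/myFile.zip is 4 GB, Java allocates 4 GB. This causes an out-of-heap exception. Unfortunately, a heap size of 4 GB or more is not an acceptable solution. =(

Solution to problem 1: use ZipInputStream and process the file as a stream, which reduces and bounds the memory footprint.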

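import java.io.FileInputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

byte[] buf = new byte[1024];
FileInputStream fs = new FileInputStream("/tmp/myFile.zip");
ZipInputStream zipIn = new ZipInputStream(fs);

ZipEntry ze = zipIn.getNextEntry();

while (ze != null) {
  int cr;
  // read() returns -1 at the end of the current entry, not the end of the file
  while ((cr = zipIn.read(buf, 0, 1024)) > -1)
    System.out.write(buf, 0, cr);
  ze = zipIn.getNextEntry();
}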

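Problem 2: I would like random access to the ZipEntries. That is, I want to decompress a single ZipEntry without having to read through the whole stream. Currently I build a list of the ZipEntries, called zes: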

        // ZipInputStream wraps an InputStream; it has no constructor taking a path
        ZipInputStream zin = new ZipInputStream(new FileInputStream("/tmp/myFile.zip"));

        ZipEntry ze = zin.getNextEntry();
        List<ZipEntry> zes = new ArrayList<ZipEntry>();

        while(ze!=null){
            zes.add(ze);
            ze = zin.getNextEntry();
        }

Then, when I need to decompress a particular entry, I iterate over all the entries until I find the matching one, and decompress it.

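        // zin must be a fresh ZipInputStream opened at the start of the archive
        ZipEntry ze = zin.getNextEntry();
        // stop at the matching entry, or at the end of the archive (ze == null)
        while (ze != null && !ze.getName().equals(queryZe.getName())) {
            ze = zin.getNextEntry();
        }

        int cr;

        // if no entry matched, read() returns -1 immediately and nothing is written
        while ((cr = zin.read(buf)) > -1)
            System.out.write(buf, 0, cr);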
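The scan-and-read approach described above can be wrapped in a helper that behaves a little like zipFile.getInputStream(zipEntry), at the cost of re-reading the archive from the start on every lookup. The following is only a sketch under stated assumptions: the class name ZipEntryLookup and the methods openEntry, readEntry and sampleZip are hypothetical names introduced for illustration, and the archive is built in memory so the example is self-contained. For a real file, a FileInputStream would be passed to openEntry instead. It uses only java.util.zip and avoids try-with-resources, but it has not been checked against a 1.4 compiler.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEntryLookup {

    /**
     * Scans the archive from the start and returns a stream positioned on the
     * named entry, or null if it is absent. The caller closes the result.
     */
    public static InputStream openEntry(InputStream rawZip, String name) throws IOException {
        ZipInputStream zin = new ZipInputStream(rawZip);
        boolean found = false;
        try {
            ZipEntry ze;
            while ((ze = zin.getNextEntry()) != null) {
                if (ze.getName().equals(name)) {
                    found = true;
                    return zin; // read() now returns -1 at the end of this entry
                }
            }
            return null;
        } finally {
            if (!found) {
                zin.close(); // nothing to hand back, so release the stream here
            }
        }
    }

    /** Demo helper (hypothetical): reads one entry into a String, null if missing. */
    public static String readEntry(byte[] zipBytes, String name) {
        try {
            InputStream in = openEntry(new ByteArrayInputStream(zipBytes), name);
            if (in == null) {
                return null;
            }
            try {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[1024]; // fixed buffer keeps memory use bounded
                int n;
                while ((n = in.read(buf)) > -1) {
                    out.write(buf, 0, n);
                }
                return out.toString("UTF-8");
            } finally {
                in.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Demo helper (hypothetical): builds a two-entry archive in memory. */
    public static byte[] sampleZip() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ZipOutputStream zout = new ZipOutputStream(bytes);
            zout.putNextEntry(new ZipEntry("a.txt"));
            zout.write("first".getBytes("UTF-8"));
            zout.closeEntry();
            zout.putNextEntry(new ZipEntry("b.txt"));
            zout.write("second".getBytes("UTF-8"));
            zout.closeEntry();
            zout.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readEntry(sampleZip(), "b.txt"));
    }
}
```

This is not true random access: every entry stored before the requested one still has to be read and skipped, so a lookup near the end of a 4 GB archive reads most of the file. It only keeps memory use bounded by the buffer size.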

Question: ZipFile has the ability to access ZipEntries randomly.

new BufferedInputStream(zipFile.getInputStream(zipEntry));

How can I get the same ability without using ZipFile?

Note that ZipInputStream has some rather odd behavior.

Particularly good documentation on Java and zip files can be found here:

http://commons.apache.org/compress/zip.html

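A note on replacing Sun's ZipFile with the Apache Commons ZipFile, as suggested in the answers:

Sun's ZipFile.entries() always returns the ZipEntries in the order in which they appear in the file, whereas the Apache Commons ZipFile.getEntries() returns them in random order. This caused an interesting bug, because some code assumed the entries were "in order".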
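Code that depends on entry order can make that dependency explicit instead of trusting whatever an enumeration returns. The sketch below is one possible way to do that with java.util.zip alone: it records the archive order once by scanning with ZipInputStream, then sorts any other listing of names against it. The class EntryOrder and its methods are hypothetical names introduced here, and the sample archive is built in memory purely for illustration. Commons Compress also documents a getEntriesInPhysicalOrder() method on its ZipFile, which may be the simpler fix when that library is already in use; it is left out here so the example has no dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class EntryOrder {

    /** Lists entry names in the order the entries are stored in the stream. */
    public static List<String> namesInArchiveOrder(InputStream rawZip) throws IOException {
        List<String> names = new ArrayList<String>();
        ZipInputStream zin = new ZipInputStream(rawZip);
        try {
            ZipEntry ze;
            while ((ze = zin.getNextEntry()) != null) {
                names.add(ze.getName());
            }
        } finally {
            zin.close();
        }
        return names;
    }

    /** Returns a copy of names sorted by archive position; unknown names go last. */
    public static List<String> sortByArchiveOrder(List<String> names, List<String> archiveOrder) {
        final Map<String, Integer> position = new HashMap<String, Integer>();
        for (int i = 0; i < archiveOrder.size(); i++) {
            position.put(archiveOrder.get(i), Integer.valueOf(i));
        }
        List<String> sorted = new ArrayList<String>(names);
        Collections.sort(sorted, new Comparator<String>() {
            public int compare(String a, String b) {
                return rank(a).compareTo(rank(b));
            }
            private Integer rank(String name) {
                Integer p = position.get(name);
                return p != null ? p : Integer.valueOf(Integer.MAX_VALUE);
            }
        });
        return sorted;
    }

    /** Demo helper (hypothetical): writes c, a, b and restores that order. */
    public static List<String> demo() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ZipOutputStream zout = new ZipOutputStream(bytes);
            String[] written = {"c.txt", "a.txt", "b.txt"};
            for (int i = 0; i < written.length; i++) {
                zout.putNextEntry(new ZipEntry(written[i]));
                zout.write(written[i].getBytes("UTF-8"));
                zout.closeEntry();
            }
            zout.close();
            List<String> order =
                namesInArchiveOrder(new ByteArrayInputStream(bytes.toByteArray()));
            // A listing in some other order, standing in for an unordered API result.
            List<String> other = new ArrayList<String>();
            other.add("a.txt");
            other.add("b.txt");
            other.add("c.txt");
            return sortByArchiveOrder(other, order);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Recording the order costs one full pass over the archive, so it makes sense only when the order is computed once and reused.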

score 4 · Accepted Answer

For this task, you may want to look at Apache Commons Compress, Apache Commons VFS, or TrueZip. All of these should be Java 1.4 compatible, and probably support the features you need.

score 2 · Accepted Answer

You could take a look at Apache Commons Compress, which works with 1.4+, but I don't know whether it exhibits the same bug.
