I have a 7zip archive created with LZMA2 compression (compression level: ultra). The archive contains 1,749 files with a total original size of 661 MB. The compressed file is 39 MB.

Now I'm trying to use C# to extract a single tiny file (roughly 200 KB) from this archive.

I get the corresponding IArchiveEntry from the IArchive (which is relatively fast), but then calling IArchiveEntry.WriteToFile(targetPath) takes around 33 seconds! Writing to a memory stream instead takes similarly long. (Edit: when I run this on a 7z LZMA2 archive with compression level = normal, it still takes 9 seconds.)

When I open the same archive in the actual 7zip application and extract the same file from there, it takes only around 2-3 seconds. I suspected some sort of multi-core (7zip) vs. single-core (SharpCompress, probably?) difference, but I don't notice any CPU usage spike during decompression with 7zip. Maybe it's too fast to be noticeable, though.

Does anyone know what could cause such slow speeds with SharpCompress? Am I maybe missing some setting or using the wrong factory (ArchiveFactory)?

If not - is there any C# library out there that might be significantly faster at decompressing this?

For reference, here's a sketch of how I'm using SharpCompress to extract:

private void Extract()
{
    using (var archive = GetArchive())
    {
        var entryPath = /* ... path to entry .. */;
        var entry = TryGetEntry(archive, entryPath);
        if (entry == null)
            return; // TryGetEntry returns null when no entry matches

        entry.WriteToFile(some_target_path);
    }
}

private IArchive GetArchive()
{
    string path = /* .. path to my .7z file */;
    return ArchiveFactory.Open(path);
}

private IArchiveEntry TryGetEntry(IArchive archive, string path)
{
    // Entry keys use forward slashes, so normalize the requested path first.
    path = path.Replace("\\", "/");

    foreach (var entry in archive.Entries)
    {
        if (!entry.IsDirectory && entry.Key == path)
            return entry;
    }

    return null;
}

Update: As a temporary solution, I'm now including the 7zr.exe from the 7zip SDK in my application and running it in a new process to extract a single file, reading the process's output into a binary stream. This takes around 3 seconds compared to around 33 seconds with SharpCompress. It works for now, but it's kind of ugly, so I'm still curious why SharpCompress seems to be so slow here.
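
For readers who want to try the same kind of workaround, here is a minimal sketch. It is not the code actually used above. It assumes that 7zr accepts the `e` (extract) command together with the `-so` switch (write extracted data to stdout), and the class, method, and parameter names are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class SevenZipProcess
{
    // Builds the command line: e "<archive>" -so "<entry>".
    // Assumption: 7zr supports "e" and "-so"; paths must not contain quotes.
    public static string BuildArguments(string archivePath, string entryPath)
    {
        return $"e \"{archivePath}\" -so \"{entryPath}\"";
    }

    // Runs the executable and returns the bytes it wrote to stdout.
    public static byte[] ExtractToMemory(string sevenZipExe, string archivePath, string entryPath)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = sevenZipExe,               // e.g. the bundled 7zr.exe
            Arguments = BuildArguments(archivePath, entryPath),
            UseShellExecute = false,              // required for stream redirection
            RedirectStandardOutput = true,
            CreateNoWindow = true,
        };

        using (var process = Process.Start(startInfo))
        using (var buffer = new MemoryStream())
        {
            // Drain stdout before waiting, so a full pipe cannot block the child.
            process.StandardOutput.BaseStream.CopyTo(buffer);
            process.WaitForExit();

            if (process.ExitCode != 0)
                throw new IOException($"Extraction process exited with code {process.ExitCode}.");

            return buffer.ToArray();
        }
    }
}
```

Reading from StandardOutput.BaseStream rather than the text reader keeps the data binary-safe.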

2 Answers

This line is the problem:

foreach (var entry in archive.Entries)

The problem is described here (i.e., if there are 100 files, it decompresses the 1st file 100 times, the 2nd file 99 times, and so on).
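
To put a number on that, here is a small model of the behaviour described above. It does not touch SharpCompress at all. It simply assumes that extracting any one entry re-decompresses every entry stored before it, and the class and method names are made up for illustration.

```csharp
using System;

static class SolidArchiveCost
{
    // passes[j] = how often file j+1 gets decompressed when each of the
    // fileCount entries is extracted on its own and every extraction has to
    // start again from the first file (the assumption stated above).
    public static int[] PassesPerFile(int fileCount)
    {
        var passes = new int[fileCount];
        for (int target = 0; target < fileCount; target++)
        {
            // Reaching entry "target" means decompressing entries 0..target.
            for (int j = 0; j <= target; j++)
                passes[j]++;
        }
        return passes;
    }
}
```

For PassesPerFile(100), the first file comes out at 100 passes, the second at 99, and the total is 5,050, compared with 100 for a single forward pass over the archive.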

You need to use a reader (forward-only) instead. See the API.
But the sample code there doesn't support 7z.

For 7z you can use archive.ExtractAllEntries(), e.g.:

// Forward-only reader: walks the entries once, in archive order.
using (var reader = archive.ExtractAllEntries())
{
    while (reader.MoveToNextEntry())
    {
        if (!reader.Entry.IsDirectory)
            reader.WriteEntryToDirectory(extractDir, new ExtractionOptions() { ExtractFullPath = false, Overwrite = true });
    }
}

It will be much faster.
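
Since the question is about a single file, the same reader can also stop at the first matching entry. The following is a sketch only. The class and method names are hypothetical, the namespace that holds ExtractionOptions may differ between SharpCompress versions (both candidates are imported), and in a solid archive the data stored before the wanted entry most likely still has to be decompressed once on the way there.

```csharp
using System;
using System.IO;
using SharpCompress.Archives;
using SharpCompress.Common;
using SharpCompress.Readers;

static class SingleEntryExtractor
{
    // Returns true if an entry with the given key was found and written
    // to targetPath, false if the archive contains no such entry.
    public static bool TryExtract(string archivePath, string entryKey, string targetPath)
    {
        // Normalize separators the same way the question's TryGetEntry does.
        entryKey = entryKey.Replace("\\", "/");

        using (var archive = ArchiveFactory.Open(archivePath))
        using (var reader = archive.ExtractAllEntries())
        {
            while (reader.MoveToNextEntry())
            {
                // Non-matching entries are skipped and never written to disk.
                if (reader.Entry.IsDirectory || reader.Entry.Key != entryKey)
                    continue;

                reader.WriteEntryToFile(targetPath, new ExtractionOptions { Overwrite = true });
                return true; // stop here; later entries are not needed
            }
        }

        return false;
    }
}
```

The boolean return value takes the place of the null check that the IArchiveEntry-based version needs.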

answered 2017-06-06T00:12:47.510

If you need all the files you could also do:

using var reader = archive.ExtractAllEntries();
reader.WriteAllToDirectory(targetPath, new ExtractionOptions() { ExtractFullPath = true, Overwrite = true });
answered 2019-06-27T12:15:44.907