
Searching for how to inflate gzip-compressed data on iOS turns up the following method in a number of results:

- (NSData *)gzipInflate
{
    if ([self length] == 0) return self;

    unsigned full_length = [self length];
    unsigned half_length = [self length] / 2;

    NSMutableData *decompressed = [NSMutableData dataWithLength: full_length + half_length];
    BOOL done = NO;
    int status;

    z_stream strm;
    strm.next_in = (Bytef *)[self bytes];
    strm.avail_in = [self length];
    strm.total_out = 0;
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;

    if (inflateInit2(&strm, (15+32)) != Z_OK) return nil;
    while (!done)
    {
        // Make sure we have enough room and reset the lengths.
        if (strm.total_out >= [decompressed length])
            [decompressed increaseLengthBy: half_length];
        strm.next_out = [decompressed mutableBytes] + strm.total_out;
        strm.avail_out = [decompressed length] - strm.total_out;

        // Inflate another chunk.
        status = inflate (&strm, Z_SYNC_FLUSH);
        if (status == Z_STREAM_END) done = YES;
        else if (status != Z_OK) break;
    }
    if (inflateEnd (&strm) != Z_OK) return nil;

    // Set real length.
    if (done)
    {
        [decompressed setLength: strm.total_out];
        return [NSData dataWithData: decompressed];
    }
    else return nil;
}

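However, I have come across some samples of data (deflated on a Linux machine with Python's gzip module) that this method, running on iOS, fails to inflate. Here is what happens:

In the last iteration of the while loop, inflate() returns Z_BUF_ERROR and the loop exits. The call to inflateEnd() after the loop, however, returns Z_OK. The code then assumes that, because inflate() never returned Z_STREAM_END, the inflation failed, and it returns nil.

According to this page, http://www.zlib.net/zlib_faq.html#faq05, Z_BUF_ERROR is not a fatal error, and my tests with a limited number of samples show that the data is inflated successfully whenever inflateEnd() returns Z_OK, even though the last call to inflate() did not return Z_OK. It seems as though inflateEnd() finishes inflating the last chunk of data.

I don't know much about compression or how gzip works, so I am hesitant to change this code without fully understanding what it does. I am hoping that someone with more knowledge of the topic can shed some light on this potential logic flaw in the code above and suggest a way to fix it.

Another method that Google turns up, which seems to suffer from the same problem, can be found here: https://github.com/nicklockwood/GZIP/blob/master/GZIP/NSData%2BGZIP.m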

Edit:

So, it is a bug! Now, how do we fix it? Below is my attempt. Code review, anyone?

- (NSData *)gzipInflate
{
    if ([self length] == 0) return self;

    unsigned full_length = [self length];
    unsigned half_length = [self length] / 2;

    NSMutableData *decompressed = [NSMutableData dataWithLength: full_length + half_length];
    int status;

    z_stream strm;
    strm.next_in = (Bytef *)[self bytes];
    strm.avail_in = [self length];
    strm.total_out = 0;
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;

    if (inflateInit2(&strm, (15+32)) != Z_OK) return nil;

    do
    {
        // Make sure we have enough room and reset the lengths.
        if (strm.total_out >= [decompressed length])
            [decompressed increaseLengthBy: half_length];
        strm.next_out = [decompressed mutableBytes] + strm.total_out;
        strm.avail_out = [decompressed length] - strm.total_out;

        // Inflate another chunk.
        status = inflate (&strm, Z_SYNC_FLUSH);

        switch (status) {
            case Z_NEED_DICT:
                status = Z_DATA_ERROR;     /* and fall through */
            case Z_DATA_ERROR:
            case Z_MEM_ERROR:
            case Z_STREAM_ERROR:
                (void)inflateEnd(&strm);
                return nil;
        }
    } while (status != Z_STREAM_END);

    (void)inflateEnd (&strm);

    // Set real length.
    if (status == Z_STREAM_END)
    {
        [decompressed setLength: strm.total_out];
        return [NSData dataWithData: decompressed];
    }
    else return nil;
}

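Edit 2:

Here is a sample Xcode project that illustrates the problem I am running into. The deflation happens on the server side, and the data is base64- and URL-encoded before being transported over HTTP. I have embedded the URL-encoded base64 string in ViewController.m. The URL-decode and base64-decode steps, as well as your gzipInflate method, are in NSDataExtension.m.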

https://dl.dropboxusercontent.com/u/38893107/gzip/GZIPTEST.zip

Here is the binary file as deflated by the Python gzip library:

https://dl.dropboxusercontent.com/u/38893107/gzip/binary.zip

Here is the URL-encoded base64 string that is transported over HTTP: https://dl.dropboxusercontent.com/u/38893107/gzip/urlEncodedBase64.txt
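
For reference, the server-side steps described above (gzip, then base64, then URL-encoding) can be sketched in Python as follows. This is only an illustration under assumptions: the real server code is not shown in this question, the helper names encode_for_transport and decode_from_transport are hypothetical, and the payload is made up.

```python
import base64
import gzip
import urllib.parse


def encode_for_transport(raw: bytes) -> str:
    """gzip-compress, base64-encode, then URL-encode (assumed server-side order)."""
    compressed = gzip.compress(raw)
    b64 = base64.b64encode(compressed).decode("ascii")
    # safe="" makes quote() escape '/', '+' and '=' as well
    return urllib.parse.quote(b64, safe="")


def decode_from_transport(text: str) -> bytes:
    """Reverse the steps: URL-decode, base64-decode, then gunzip."""
    b64 = urllib.parse.unquote(text)
    compressed = base64.b64decode(b64)
    return gzip.decompress(compressed)


payload = b"hello gzip " * 100          # made-up sample data
wire = encode_for_transport(payload)
restored = decode_from_transport(wire)
```

A round trip through both helpers should give back the original bytes, which is a quick way to rule out the encoding layers before looking at the inflate step.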


2 Answers


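Yes, it's a bug.

It is in fact correct that if inflate() does not return Z_STREAM_END, then you have not completed inflation. inflateEnd() returning Z_OK doesn't really mean much: only that it was given a valid state and was able to free the memory.

So inflate() must eventually return Z_STREAM_END before you can declare success. However, Z_BUF_ERROR is not a reason to give up. In that case you simply call inflate() again with more input or more output space. Then you will get Z_STREAM_END.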
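
To illustrate the point about Z_STREAM_END, here is a minimal sketch that uses Python's zlib module rather than the Objective-C code from the question. It assumes that the decompression object's eof attribute, which becomes True once the end of the compressed stream has been reached, plays the role of inflate() returning Z_STREAM_END. The helper name inflate_all and the sample data are made up for the example.

```python
import gzip
import zlib


def inflate_all(data: bytes):
    """Return (output, finished); finished is True only at the end of the stream."""
    # 15 + 32 auto-detects a zlib or gzip header, like inflateInit2(&strm, 15+32)
    d = zlib.decompressobj(15 + 32)
    out = d.decompress(data)
    return out, d.eof


original = b"The quick brown fox jumps over the lazy dog. " * 50
stream = gzip.compress(original)

# Complete stream: the end of the stream is reached
out, finished = inflate_all(stream)

# Truncated stream: no exception is raised, but the end is never reached
partial, finished_partial = inflate_all(stream[:-10])
```

In the truncated case the call does not fail outright, so checking for an error alone is not enough; only the end-of-stream indication says that all of the data was recovered.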

From the documentation in zlib.h:

/* ...
Z_BUF_ERROR if no progress is possible or if there was not enough room in the
output buffer when Z_FINISH is used.  Note that Z_BUF_ERROR is not fatal, and
inflate() can be called again with more input and more output space to
continue decompressing.
... */
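
The "call again with more output space" idea from that comment can be sketched with Python's zlib module, which wraps the same library. This is an illustration under assumptions, not the code from the question: the max_length argument of decompress() caps the output of each call, and the unconsumed_tail attribute holds the input that was not used because of that cap. The helper name inflate_in_steps and the step size are hypothetical.

```python
import gzip
import zlib


def inflate_in_steps(data: bytes, step: int = 64) -> bytes:
    """Inflate a gzip stream while giving the decompressor little output room per call."""
    d = zlib.decompressobj(15 + 16)      # 15 + 16 accepts gzip streams only
    pieces = []
    pending = data
    while not d.eof:
        # Each call may stop early because the output allowance is used up;
        # that is not an error, we just call again with the leftover input.
        chunk = d.decompress(pending, step)
        pieces.append(chunk)
        pending = d.unconsumed_tail
        if not d.eof and not chunk and not pending:
            # No input left, no output produced, and no end of stream
            raise ValueError("incomplete gzip stream")
    return b"".join(pieces)


sample = b"0123456789" * 200             # made-up sample data
result = inflate_in_steps(gzip.compress(sample))
```

The loop gives up only when no progress is possible at all, which matches the situation the comment describes as "no progress is possible".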

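Update:

Since there is buggy code floating around out there, below is proper code to implement the desired method. This code handles incomplete gzip streams, concatenated gzip streams, and very large gzip streams. For very large gzip streams, the unsigned lengths in z_stream are not large enough when compiled as a 64-bit executable: NSUInteger is 64 bits, whereas unsigned is 32 bits. In that case, you have to loop on the input to feed it to inflate().

This example simply returns nil on any error. The nature of the error is noted in a comment after each return nil;, in case more sophisticated error handling is desired.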

- (NSData *) gzipInflate
{
    z_stream strm;

    // Initialize input
    strm.next_in = (Bytef *)[self bytes];
    NSUInteger left = [self length];        // input left to decompress
    if (left == 0)
        return nil;                         // incomplete gzip stream

    // Create starting space for output (guess double the input size, will grow
    // if needed -- in an extreme case, could end up needing more than 1000
    // times the input size)
    NSUInteger space = left << 1;
    if (space < left)
        space = NSUIntegerMax;
    NSMutableData *decompressed = [NSMutableData dataWithLength: space];
    space = [decompressed length];

    // Initialize output
    strm.next_out = (Bytef *)[decompressed mutableBytes];
    NSUInteger have = 0;                    // output generated so far

    // Set up for gzip decoding
    strm.avail_in = 0;
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    int status = inflateInit2(&strm, (15+16));
    if (status != Z_OK)
        return nil;                         // out of memory

    // Decompress all of self
    do {
        // Allow for concatenated gzip streams (per RFC 1952)
        if (status == Z_STREAM_END)
            (void)inflateReset(&strm);

        // Provide input for inflate
        if (strm.avail_in == 0) {
            strm.avail_in = left > UINT_MAX ? UINT_MAX : (unsigned)left;
            left -= strm.avail_in;
        }

        // Decompress the available input
        do {
            // Allocate more output space if none left
            if (space == have) {
                // Double space, handle overflow
                space <<= 1;
                if (space < have) {
                    space = NSUIntegerMax;
                    if (space == have) {
                        // space was already maxed out!
                        (void)inflateEnd(&strm);
                        return nil;         // output exceeds integer size
                    }
                }

                // Increase space
                [decompressed setLength: space];
                space = [decompressed length];

                // Update output pointer (might have moved)
                strm.next_out = (Bytef *)[decompressed mutableBytes] + have;
            }

            // Provide output space for inflate
            strm.avail_out = space - have > UINT_MAX ? UINT_MAX :
                             (unsigned)(space - have);
            have += strm.avail_out;

            // Inflate and update the decompressed size
            status = inflate (&strm, Z_SYNC_FLUSH);
            have -= strm.avail_out;

            // Bail out if any errors
            if (status != Z_OK && status != Z_BUF_ERROR &&
                status != Z_STREAM_END) {
                (void)inflateEnd(&strm);
                return nil;                 // invalid gzip stream
            }

            // Repeat until all output is generated from provided input (note
            // that even if strm.avail_in is zero, there may still be pending
            // output -- we're not done until the output buffer isn't filled)
        } while (strm.avail_out == 0);

        // Continue until all input consumed
    } while (left || strm.avail_in);

    // Free the memory allocated by inflateInit2()
    (void)inflateEnd(&strm);

    // Verify that the input is a valid gzip stream
    if (status != Z_STREAM_END)
        return nil;                         // incomplete gzip stream

    // Set the actual length and return the decompressed data
    [decompressed setLength: have];
    return decompressed;
}
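
For comparison, the handling of concatenated and incomplete gzip streams can be sketched in a few lines of Python with the zlib module. This is a simplified illustration under assumptions, not a port of the method above: it creates a fresh decompression object for each gzip member where the code above calls inflateReset(), it relies on the unused_data attribute for the bytes that follow a finished member, and it does not show the chunked feeding needed for very large inputs. The name gunzip_concatenated is hypothetical.

```python
import gzip
import zlib


def gunzip_concatenated(data: bytes) -> bytes:
    """Inflate one or more gzip members stored back to back (per RFC 1952)."""
    if not data:
        raise ValueError("incomplete gzip stream")   # empty input
    pieces = []
    remaining = data
    while remaining:
        d = zlib.decompressobj(15 + 16)   # gzip decoding, fresh state per member
        pieces.append(d.decompress(remaining))
        if not d.eof:
            # Input ran out before the end of this member was reached
            raise ValueError("incomplete gzip stream")
        remaining = d.unused_data         # bytes after the member that just ended
    return b"".join(pieces)


first = b"first member " * 20             # made-up sample data
second = b"second member " * 20
joined = gunzip_concatenated(gzip.compress(first) + gzip.compress(second))
```

Invalid data, including trailing bytes that are not another gzip member, surfaces here as zlib.error, which corresponds to the "invalid gzip stream" case above.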

answered 2013-07-23T22:38:16.763

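Yes, it looks like a bug. According to this annotated example from the zlib site, Z_BUF_ERROR is merely an indication that there is no more output unless inflate() is given more input, which is not in itself a reason to abort the inflate loop abnormally.

In fact, the linked sample seems to handle Z_BUF_ERROR exactly like Z_OK.

answered 2013-07-23T21:08:03.283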