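Is this a bug in this gzip inflate method?

When searching for how to inflate gzip-compressed data on iOS, the following method appears in a number of results: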

    - (NSData *)gzipInflate
    {
        if ([self length] == 0) return self;

        unsigned full_length = [self length];
        unsigned half_length = [self length] / 2;

        NSMutableData *decompressed = [NSMutableData dataWithLength: full_length + half_length];
        BOOL done = NO;
        int status;

        z_stream strm;
        strm.next_in = (Bytef *)[self bytes];
        strm.avail_in = [self length];
        strm.total_out = 0;
        strm.zalloc = Z_NULL;
        strm.zfree = Z_NULL;

        if (inflateInit2(&strm, (15+32)) != Z_OK) return nil;
        while (!done)
        {
            // Make sure we have enough room and reset the lengths.
            if (strm.total_out >= [decompressed length])
                [decompressed increaseLengthBy: half_length];
            strm.next_out = [decompressed mutableBytes] + strm.total_out;
            strm.avail_out = [decompressed length] - strm.total_out;

            // Inflate another chunk.
            status = inflate (&strm, Z_SYNC_FLUSH);
            if (status == Z_STREAM_END) done = YES;
            else if (status != Z_OK) break;
        }
        if (inflateEnd (&strm) != Z_OK) return nil;

        // Set real length.
        if (done)
        {
            [decompressed setLength: strm.total_out];
            return [NSData dataWithData: decompressed];
        }
        else return nil;
    }

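But I have come across some examples of data (deflated on a Linux machine with Python's gzip module) that this method, running on iOS, fails to inflate. Here is what happens:

In the last iteration of the while loop, inflate() returns Z_BUF_ERROR and the loop exits. However, inflateEnd(), which is called after the loop, returns Z_OK. The code then assumes that, because inflate() never returned Z_STREAM_END, the inflation failed, and it returns nil.

According to this page, http://www.zlib.net/zlib_faq.html#faq05, Z_BUF_ERROR is not a fatal error, and my tests with a limited number of examples show that the data is inflated successfully if inflateEnd() returns Z_OK, even though the last call to inflate() did not return Z_OK. It seems as if inflateEnd() finishes inflating the last chunk of data.

I don't know much about compression or how gzip works, so I am hesitant to change this code without fully understanding what it does. I am hoping that someone with more knowledge of the topic can shed some light on this potential logic flaw in the code above and suggest a way to fix it.

Another method that Google turns up, and that seems to suffer from the same problem, can be found here: https://github.com/nicklockwood/GZIP/blob/master/GZIP/NSData%2BGZIP.m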

Edit:

So, it is a bug! Now, how do we fix it? Below is my attempt. Code review, anyone?

    - (NSData *)gzipInflate
    {
        if ([self length] == 0) return self;

        unsigned full_length = [self length];
        unsigned half_length = [self length] / 2;

        NSMutableData *decompressed = [NSMutableData dataWithLength: full_length + half_length];
        int status;

        z_stream strm;
        strm.next_in = (Bytef *)[self bytes];
        strm.avail_in = [self length];
        strm.total_out = 0;
        strm.zalloc = Z_NULL;
        strm.zfree = Z_NULL;

        if (inflateInit2(&strm, (15+32)) != Z_OK) return nil;

        do
        {
            // Make sure we have enough room and reset the lengths.
            if (strm.total_out >= [decompressed length])
                [decompressed increaseLengthBy: half_length];
            strm.next_out = [decompressed mutableBytes] + strm.total_out;
            strm.avail_out = [decompressed length] - strm.total_out;

            // Inflate another chunk.
            status = inflate (&strm, Z_SYNC_FLUSH);

            switch (status) {
                case Z_NEED_DICT:
                    status = Z_DATA_ERROR;     /* and fall through */
                case Z_DATA_ERROR:
                case Z_MEM_ERROR:
                case Z_STREAM_ERROR:
                    (void)inflateEnd(&strm);
                    return nil;
            }
        } while (status != Z_STREAM_END);

        (void)inflateEnd (&strm);

        // Set real length.
        if (status == Z_STREAM_END)
        {
            [decompressed setLength: strm.total_out];
            return [NSData dataWithData: decompressed];
        }
        else return nil;
    }

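Edit 2:

Here is a sample Xcode project that demonstrates the problem I am running into. The deflate happens on the server side, and the data is base64 and URL encoded before being transported over HTTP. I have embedded the URL-encoded base64 string in ViewController.m. The URL decoding and base64 decoding, as well as your gzipInflate method, are in NSDataExtension.m: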

https://dl.dropboxusercontent.com/u/38893107/gzip/GZIPTEST.zip

Here is the binary file as deflated by the Python gzip library:

https://dl.dropboxusercontent.com/u/38893107/gzip/binary.zip

Here is the URL-encoded base64 string that is transported over HTTP: https://dl.dropboxusercontent.com/u/38893107/gzip/urlEncodedBase64.txt

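Yes, it is a bug.

In fact, if inflate() does not return Z_STREAM_END, then you have not completed inflation. inflateEnd() returning Z_OK does not mean much; it only means that it was given a valid state and was able to free the memory.

So inflate() must eventually return Z_STREAM_END before you can declare success. However, Z_BUF_ERROR is not a reason to give up. In that case you simply call inflate() again with more input or more output space. Then you will get Z_STREAM_END.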

From the documentation in zlib.h:

    /* ...
    Z_BUF_ERROR if no progress is possible or if there was not enough room in the
    output buffer when Z_FINISH is used. Note that Z_BUF_ERROR is not fatal, and
    inflate() can be called again with more input and more output space to
    continue decompressing.
    ... */

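Update:

Since there is buggy code floating around out there, below is proper code to implement the desired method. This code handles incomplete gzip streams, concatenated gzip streams, and very large gzip streams. For very large gzip streams, the unsigned lengths in z_stream are not large enough when compiled as a 64-bit executable: NSUInteger is 64 bits, whereas unsigned is 32 bits. In that case, you have to loop on the input to feed it to inflate().

This example simply returns nil on any error. The nature of the error is noted in a comment after each return nil;, in case more sophisticated error handling is desired.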

    - (NSData *) gzipInflate
    {
        z_stream strm;

        // Initialize input
        strm.next_in = (Bytef *)[self bytes];
        NSUInteger left = [self length];        // input left to decompress
        if (left == 0)
            return nil;                         // incomplete gzip stream

        // Create starting space for output (guess double the input size, will grow
        // if needed -- in an extreme case, could end up needing more than 1000
        // times the input size)
        NSUInteger space = left << 1;
        if (space < left)
            space = NSUIntegerMax;
        NSMutableData *decompressed = [NSMutableData dataWithLength: space];
        space = [decompressed length];

        // Initialize output
        strm.next_out = (Bytef *)[decompressed mutableBytes];
        NSUInteger have = 0;                    // output generated so far

        // Set up for gzip decoding
        strm.avail_in = 0;
        strm.zalloc = Z_NULL;
        strm.zfree = Z_NULL;
        strm.opaque = Z_NULL;
        int status = inflateInit2(&strm, (15+16));
        if (status != Z_OK)
            return nil;                         // out of memory

        // Decompress all of self
        do {
            // Allow for concatenated gzip streams (per RFC 1952)
            if (status == Z_STREAM_END)
                (void)inflateReset(&strm);

            // Provide input for inflate
            if (strm.avail_in == 0) {
                strm.avail_in = left > UINT_MAX ? UINT_MAX : (unsigned)left;
                left -= strm.avail_in;
            }

            // Decompress the available input
            do {
                // Allocate more output space if none left
                if (space == have) {
                    // Double space, handle overflow
                    space <<= 1;
                    if (space < have) {
                        space = NSUIntegerMax;
                        if (space == have) {
                            // space was already maxed out!
                            (void)inflateEnd(&strm);
                            return nil;         // output exceeds integer size
                        }
                    }

                    // Increase space
                    [decompressed setLength: space];
                    space = [decompressed length];

                    // Update output pointer (might have moved)
                    strm.next_out = (Bytef *)[decompressed mutableBytes] + have;
                }

                // Provide output space for inflate
                strm.avail_out = space - have > UINT_MAX ? UINT_MAX :
                                 (unsigned)(space - have);
                have += strm.avail_out;

                // Inflate and update the decompressed size
                status = inflate (&strm, Z_SYNC_FLUSH);
                have -= strm.avail_out;

                // Bail out if any errors
                if (status != Z_OK && status != Z_BUF_ERROR &&
                    status != Z_STREAM_END) {
                    (void)inflateEnd(&strm);
                    return nil;                 // invalid gzip stream
                }

                // Repeat until all output is generated from provided input (note
                // that even if strm.avail_in is zero, there may still be pending
                // output -- we're not done until the output buffer isn't filled)
            } while (strm.avail_out == 0);

            // Continue until all input consumed
        } while (left || strm.avail_in);

        // Free the memory allocated by inflateInit2()
        (void)inflateEnd(&strm);

        // Verify that the input is a valid gzip stream
        if (status != Z_STREAM_END)
            return nil;                         // incomplete gzip stream

        // Set the actual length and return the decompressed data
        [decompressed setLength: have];
        return decompressed;
    }
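The input-feeding step of the method above, where a 64-bit length is handed to the 32-bit avail_in field in slices, can be looked at in isolation. The following C sketch uses two hypothetical helpers, `next_chunk` and `chunks_needed`, and `size_t` as a stand-in for NSUInteger; it only illustrates the arithmetic and does not call zlib.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: size of the next slice to assign to avail_in.
 * It is capped at UINT_MAX because avail_in is an unsigned int, and the
 * slice is subtracted from the remaining byte count in *left. */
unsigned next_chunk(size_t *left)
{
    unsigned n = *left > UINT_MAX ? UINT_MAX : (unsigned)*left;
    *left -= n;
    return n;
}

/* Hypothetical helper: how many slices an input of `total` bytes needs. */
size_t chunks_needed(size_t total)
{
    size_t left = total;
    size_t count = 0;

    while (left > 0) {
        (void)next_chunk(&left);
        count++;
    }
    return count;
}
```

Any input of at most UINT_MAX bytes fits in a single slice, so the outer loop only matters for inputs of 4 GiB or more on a 64-bit build.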

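Yes, it looks like a bug. According to this annotated example from the zlib site, Z_BUF_ERROR is merely an indication that there will be no more output unless inflate() is provided with more input; it is not in itself a reason to abort the inflate loop abnormally.

In fact, the linked sample seems to handle Z_BUF_ERROR exactly like Z_OK.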
