java - Uncompress a GZIPed HTTP response in Java

Question

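I'm trying to unzip a GZIPed HTTP response using GZIPInputStream. However, whenever I try to read the stream I always get the same exception: java.util.zip.ZipException: invalid bit length repeat

My HTTP request headers: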

GET www.myurl.com HTTP/1.0\r\n
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
X-Requested-With: XMLHttpRequest\r\n
Cookie: Some Cookies\r\n\r\n

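At the end of the HTTP response headers I get path=/Content-Encoding: gzip, followed by the gzipped response.

I tried 2 similar pieces of code to decompress it:

UPDATE: In the following code, tBytes = (the string after 'path=/Content-Encoding: gzip').getBytes ();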

GZIPInputStream  gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes));

StringBuffer  szBuffer = new StringBuffer ();

byte  tByte [] = new byte [1024];

while (true)
{
    int  iLength = gzip.read (tByte, 0, 1024); // <-- Error comes here

    if (iLength < 0)
        break;

    szBuffer.append (new String (tByte, 0, iLength));
}

And this one, which I found on this forum:

InputStream     gzipStream = new GZIPInputStream   (new ByteArrayInputStream (tBytes));
Reader          decoder    = new InputStreamReader (gzipStream, "UTF-8");//<- I tried ISO-8859-1 and get the same exception
BufferedReader  buffered   = new BufferedReader    (decoder);

I guess this is an encoding error.

Best regards,

Bill

score 9 · Accepted Answer

You don't show how you obtain the tBytes that you use to set up the gzip stream here:

GZIPInputStream  gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes));

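One explanation is that you are including the entire HTTP response in tBytes. Instead, it should contain only the content that follows the HTTP headers.

Another explanation is that the response is chunked.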
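To illustrate the chunked case: with Transfer-Encoding: chunked, the body is a sequence of chunks, each preceded by a hexadecimal size line, and those size lines have to be removed before the bytes are handed to GZIPInputStream. The following is only a minimal sketch under assumptions: the class and method names (ChunkedBodyDecoder, dechunk) are made up for this example, the input is assumed to start right after the headers, and trailers after the last chunk are ignored.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class ChunkedBodyDecoder {

    // Joins the chunks of a chunked message body into one byte array.
    // 'body' must start at the first chunk-size line.
    static byte[] dechunk(byte[] body) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (true) {
            int lineEnd = indexOfCrlf(body, pos);
            if (lineEnd < 0) {
                throw new IOException("missing chunk-size line");
            }
            String sizeLine = new String(body, pos, lineEnd - pos, StandardCharsets.US_ASCII);
            int semi = sizeLine.indexOf(';');          // drop optional chunk extensions
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size;
            try {
                size = Integer.parseInt(sizeLine.trim(), 16);
            } catch (NumberFormatException e) {
                throw new IOException("bad chunk size: " + sizeLine, e);
            }
            pos = lineEnd + 2;                         // move past the size line's CRLF
            if (size == 0) {
                break;                                 // last chunk; trailers are ignored here
            }
            if (size < 0 || size > body.length - pos) {
                throw new IOException("truncated chunk");
            }
            out.write(body, pos, size);
            pos += size + 2;                           // skip chunk data and its trailing CRLF
        }
        return out.toByteArray();
    }

    // Returns the index of the next CRLF at or after 'from', or -1 if there is none.
    static int indexOfCrlf(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        byte[] chunked = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(dechunk(chunked), StandardCharsets.US_ASCII));
    }
}
```

The array returned by dechunk would then be what gets wrapped in a ByteArrayInputStream and passed to GZIPInputStream.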

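EDIT: You are taking the data after the Content-Encoding line as the message body. However, according to the HTTP 1.1 specification, header fields do not come in any particular order, so this is very dangerous.

As explained in this part of the HTTP specification, the message body of a request or response doesn't come after a particular header field but after the first empty line:

Request (section 5) and Response (section 6) messages use the generic message format of RFC 822 [9] for transferring entities (the payload of the message). Both types of message consist of a start-line, zero or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body.

You still haven't shown exactly how you compose tBytes, but at this point I think you're erroneously including the empty line in the data you try to decompress. The message body starts after the CRLF characters of the empty line.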
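The rule above can be sketched as follows: locate the first CRLF CRLF in the raw response bytes and decompress everything after it. This is a minimal sketch, not the asker's code. The names RawResponseGunzip, bodyOffset and gunzip are invented for the example, and it assumes the whole response is available as a byte array and is not chunked. It works on bytes throughout, because gzip data that has been turned into a String and back with getBytes() is generally no longer the same byte sequence.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of any library.
class RawResponseGunzip {

    // Returns the index of the first body byte, i.e. the byte right after
    // the first CRLF CRLF sequence, or -1 if the response has no empty line.
    static int bodyOffset(byte[] response) {
        for (int i = 0; i + 3 < response.length; i++) {
            if (response[i] == '\r' && response[i + 1] == '\n'
                    && response[i + 2] == '\r' && response[i + 3] == '\n') {
                return i + 4;
            }
        }
        return -1;
    }

    // Decompresses the gzip data starting at 'offset' and decodes it as text.
    static String gunzip(byte[] data, int offset, String charset) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(
                new ByteArrayInputStream(data, offset, data.length - offset))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toString(charset);   // decode once, after all bytes are collected
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a made-up response: headers, empty line, gzipped body.
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        response.write(("HTTP/1.0 200 OK\r\n"
                + "Set-Cookie: a=b; path=/\r\n"
                + "Content-Encoding: gzip\r\n"
                + "\r\n").getBytes(StandardCharsets.ISO_8859_1));
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(body)) {
            gz.write("hello gzip".getBytes(StandardCharsets.UTF_8));
        }
        response.write(body.toByteArray());

        byte[] raw = response.toByteArray();
        System.out.println(gunzip(raw, bodyOffset(raw), "UTF-8"));
    }
}
```

Searching for the empty line rather than for a particular header keeps the code independent of the order in which the server sends its headers.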

May I suggest that you use the httpclient library instead to extract the message body?

score 1 · Accepted Answer

Well, I can see the problem here:

int  iLength = gzip.read (tByte, 0, 1024);

Use the following to fix it:

byte[] buff = new byte[1024];
byte[] emptyBuff = new byte[1024];
StringBuffer unGzipRes = new StringBuffer();

int byteCount = 0;
while ((byteCount = gzip.read(buff, 0, 1024)) > 0) {
    // only append the buff elements that
    // contains data
    unGzipRes.append(new String(Arrays.copyOf(
            buff, byteCount), "utf-8"));

    // empty the buff for re-usability and
    // prevent dirty data attached at the
    // end of the buff
    System.arraycopy(emptyBuff, 0, buff, 0,
            1024);
}
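
One caveat with decoding each block of bytes separately: a multi-byte UTF-8 character can straddle two reads, in which case the per-block new String(...) calls decode it incorrectly. A possible alternative, given here only as a sketch, is to let an InputStreamReader do the decoding over the whole stream. The class and method names (GzipTextReader, readAll) are made up for this example, and the charset is assumed to be known by the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of any library.
class GzipTextReader {

    // Decompresses 'compressed' and decodes it as text in the given charset.
    // The reader keeps incomplete byte sequences between reads, so characters
    // that span a buffer boundary are still decoded correctly.
    static String readAll(InputStream compressed, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new GZIPInputStream(compressed), charset)) {
            char[] chars = new char[1024];
            int n;
            while ((n = reader.read(chars)) != -1) {
                text.append(chars, 0, n);   // append only the chars actually read
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        java.io.ByteArrayOutputStream bos = new java.io.ByteArrayOutputStream();
        try (java.util.zip.GZIPOutputStream gz = new java.util.zip.GZIPOutputStream(bos)) {
            gz.write("héllo".getBytes(java.nio.charset.StandardCharsets.UTF_8));
        }
        InputStream in = new java.io.ByteArrayInputStream(bos.toByteArray());
        System.out.println(readAll(in, java.nio.charset.StandardCharsets.UTF_8));
    }
}
```

Because read returns the number of chars it filled in, there is no need to clear the buffer between iterations.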
