I'm trying to decompress a gzipped HTTP response using GZIPInputStream. However, I always get the same exception when I try to read the stream: java.util.zip.ZipException: invalid bit length repeat
My HTTP Request Header:
GET www.myurl.com HTTP/1.0\r\n
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
X-Requested-With: XMLHttpRequest\r\n
Cookie: Some Cookies\r\n\r\n
At the end of the HTTP response header, I get path=/Content-Encoding: gzip, followed by the gzipped response.
I tried two similar pieces of code to decompress it:
UPDATE: In the following code, tBytes = (the string after 'path=/Content-Encoding: gzip').getBytes ();
GZIPInputStream gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes));
StringBuffer szBuffer = new StringBuffer ();
byte tByte [] = new byte [1024];
while (true)
{
    int iLength = gzip.read (tByte, 0, 1024); // <-- Error comes here
    if (iLength < 0)
        break;
    szBuffer.append (new String (tByte, 0, iLength));
}
And this one, which I found on this forum:
InputStream gzipStream = new GZIPInputStream (new ByteArrayInputStream (tBytes));
Reader decoder = new InputStreamReader (gzipStream, "UTF-8");//<- I tried ISO-8859-1 and get the same exception
BufferedReader buffered = new BufferedReader (decoder);
I guess this is an encoding error.
Best regards,
bill0ute
You don't show how you get the tBytes that you use to set up the gzip stream.
One explanation is that you are including the entire HTTP response in tBytes. Instead, it should be only the content after the HTTP headers.
Another explanation is that the response is chunked.
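If the response does turn out to be chunked (a Transfer-Encoding: chunked header), the body has to be reassembled before it is handed to GZIPInputStream. Here is a minimal sketch of that step, assuming the whole body is already in memory as raw bytes. The class and method names (ChunkedBodySketch, dechunk, indexOfCrlf) are made up for illustration, and trailer headers after the last chunk are simply ignored.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class ChunkedBodySketch {

    // Reassembles a chunked message body: each chunk is "<hex size>[;ext]CRLF<data>CRLF",
    // and a chunk of size 0 marks the end.
    static byte[] dechunk(byte[] body) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (true) {
            int lineEnd = indexOfCrlf(body, pos);
            if (lineEnd < 0) {
                throw new IOException("missing chunk-size line");
            }
            String sizeLine = new String(body, pos, lineEnd - pos, StandardCharsets.US_ASCII);
            int semi = sizeLine.indexOf(';');   // drop optional chunk extensions
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size;
            try {
                size = Integer.parseInt(sizeLine.trim(), 16);
            } catch (NumberFormatException e) {
                throw new IOException("bad chunk size: " + sizeLine, e);
            }
            pos = lineEnd + 2;                  // first data byte of this chunk
            if (size == 0) {
                break;                          // last chunk; trailers are ignored here
            }
            if (size < 0 || size > body.length - pos) {
                throw new IOException("truncated chunk");
            }
            out.write(body, pos, size);
            pos += size + 2;                    // skip the data and the CRLF that follows it
        }
        return out.toByteArray();
    }

    // Index of the next CRLF at or after 'from', or -1 if there is none.
    static int indexOfCrlf(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                return i;
            }
        }
        return -1;
    }
}
```

The result of dechunk is what you would then wrap in a ByteArrayInputStream for GZIPInputStream.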
edit: You are taking the data after the content-encoding line as the message body. However, according to the HTTP 1.1 specification the header fields do not come in any particular order, so this is very dangerous.
As explained in this part of the HTTP specification, the message body of a request or response doesn't come after a particular header field but after the first empty line.
You still haven't shown exactly how you compose tBytes, but at this point I think you're erroneously including the empty line in the data that you try to decompress. The message body starts after the CRLF characters of the empty line.
May I suggest that you use the httpclient library instead to extract the message body?
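If you do want to parse the response by hand, here is a minimal sketch of the idea, assuming you have the complete, non-chunked response as a byte array read straight from the socket. It looks for the first empty line (CRLF CRLF) and decompresses only the bytes after it. It also keeps everything as raw bytes until after decompression, because round-tripping compressed data through a String and getBytes () can alter it. The names GzipBodySketch, bodyOffset, gunzip, decodeBody and fakeResponse are made up for this example, and the UTF-8 charset of the decoded text is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of any library.
class GzipBodySketch {

    // Index of the first body byte, i.e. just past the first CRLF CRLF, or -1 if there is none.
    static int bodyOffset(byte[] response) {
        for (int i = 0; i + 3 < response.length; i++) {
            if (response[i] == '\r' && response[i + 1] == '\n'
                    && response[i + 2] == '\r' && response[i + 3] == '\n') {
                return i + 4;
            }
        }
        return -1;
    }

    // Decompresses a complete gzip stream held in memory.
    static byte[] gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) >= 0) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    // Splits headers from body on the raw bytes, then decompresses the body.
    // Assumes the decompressed text is UTF-8.
    static String decodeBody(byte[] rawResponse) throws IOException {
        int offset = bodyOffset(rawResponse);
        if (offset < 0) {
            throw new IOException("no empty line between headers and body");
        }
        byte[] body = Arrays.copyOfRange(rawResponse, offset, rawResponse.length);
        return new String(gunzip(body), StandardCharsets.UTF_8);
    }

    // Builds a fake gzip-encoded response so the sketch can run without a network.
    static byte[] fakeResponse(String text) throws IOException {
        ByteArrayOutputStream gz = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(gz)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        byte[] head = ("HTTP/1.0 200 OK\r\n"
                + "Content-Encoding: gzip\r\n"
                + "Set-Cookie: a=b; path=/\r\n"
                + "\r\n").getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        all.write(head);
        all.write(gz.toByteArray());
        return all.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decodeBody(fakeResponse("Hello, gzip")));
    }
}
```

Note that Content-Encoding is deliberately not the last header in the fake response: the split depends only on the empty line, not on where any particular header appears.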
Well, there is a problem I can see here:
Use the following to fix that: