HTTP gzip encoding of html

Posted 2019-09-08 07:26

For a project of mine I'm having to code my own lite web server. At the moment it does what I want it to do, but it's kind of slow, at least too slow for me. So I was thinking about implementing gzip compression to speed things up. Here's how:

public static String encodeToGZip(String data) {
        ByteArrayOutputStream bout = null;
        try {
            bout = new ByteArrayOutputStream();
            GZIPOutputStream output = new GZIPOutputStream(bout);
            output.write(data.getBytes());
            output.flush();
            output.close();
            bout.close();
        } catch (IOException ex) {
            ex.printStackTrace();
        }

        try {
            return new String(bout.toByteArray(), "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            return null;
        }
    }

The problem is that the browser can't decode the data I've sent, even though it states that it accepts gzip encoding, so I must be sending corrupt data.

This is the result (Wireshark capture):

GET /login.html HTTP/1.1
Host: localhost:9090
Connection: keep-alive
Cache-Control: no-cache
Pragma: no-cache
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.79 Safari/535.11
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3

HTTP/1.1 200 OK
Connection: close
Server: My Lite Server v0
Content-Encoding: gzip
Content-Type: text/html

...............T...N...0....#.......O...?...$...........BB...g...6...[.....u...........6......................g6e...............S......c..$..........`I Gw............AOAhU...XO...d...].... IU...h...+......[.....Y.........b...|x.........rm1.........1.....L...uI.........S...n............F......T2.[$X.......M.....M.#*...........d....58HL:....Wx......Z...........m...t...Z.)'XQdg ......X.........~......(......<.......p/....... ..........."...6|7........3 ...r.Sv.../...rT...."..........SrJ..........M.vR^...4$... .q...x.................../...8...........M...y#...j......7........d..le....;..................~......o....F......

1 Answer
劫难
Reply #2 · 2019-09-08 08:14
return new String(bout.toByteArray(), "UTF-8");

This line in your method will produce garbage strings.

The above constructor performs a transcoding operation from the given encoding to UTF-16. You take a bunch of arbitrary bytes and try to decode them as UTF-8. You can only decode UTF-8 encoded character data as UTF-8. Java does not have binary-safe strings (all strings are UTF-16); you must use byte arrays instead.
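To make the corruption visible, here is a minimal sketch (the class and method names are made up for illustration, not taken from the question). It compresses a small page, pushes the bytes through a UTF-8 String and back, and compares the result with the original bytes. The second byte of every gzip stream is 0x8b, which is not valid UTF-8 on its own, so the decoder replaces it and the round trip cannot return the same bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of the original server code.
class GzipStringRoundTrip {

    // Compress text to raw gzip bytes.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    // Decode the bytes as UTF-8 text, encode them again, and report
    // whether the bytes came back unchanged.
    static boolean survivesStringRoundTrip(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        byte[] back = asText.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(raw, back);
    }

    public static void main(String[] args) {
        byte[] compressed = gzip("<html><body>login</body></html>");
        // Prints "false": the invalid byte 0x8b was replaced during decoding.
        System.out.println(survivesStringRoundTrip(compressed));
    }
}
```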

Just write the compressed bytes to your OutputStream.

Avoid using data.getBytes() as it uses the default system encoding. This will produce non-portable code as the default system encoding is system and configuration dependent. Prefer always setting an encoding explicitly, for example data.getBytes("UTF-8").
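Putting both points together, a corrected version could look roughly like the sketch below. Treat it as one possible shape, not the only one: encodeToGZip keeps the name from the question but now returns byte[], while writeGzipResponse and decodeFromGZip are hypothetical helpers added here for illustration. The Content-Length header and the charset=utf-8 parameter are also additions that were not in the captured response.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch class; names other than encodeToGZip are assumptions.
class GzipResponseSketch {

    // Compress the text and return the raw gzip bytes (no String conversion).
    static byte[] encodeToGZip(String data) {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream writes the gzip trailer into bout.
        try (GZIPOutputStream gz = new GZIPOutputStream(bout)) {
            gz.write(data.getBytes(StandardCharsets.UTF_8)); // explicit charset
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bout.toByteArray();
    }

    // Write headers as ASCII text, then the compressed body as raw bytes.
    static void writeGzipResponse(OutputStream out, String html) {
        byte[] body = encodeToGZip(html);
        String head = "HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/html; charset=utf-8\r\n"
                + "Content-Encoding: gzip\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
        try {
            out.write(head.getBytes(StandardCharsets.US_ASCII));
            out.write(body);
            out.flush();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Inverse of encodeToGZip, only here to check the output locally.
    static String decodeFromGZip(byte[] compressed) {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                plain.write(chunk, 0, n);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return new String(plain.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

In the server, the `out` argument would presumably be the stream obtained from the client socket (for example `socket.getOutputStream()`), so the compressed bytes reach the browser without ever passing through a String.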
