I use TIdHTTP to fetch web content. The response header indicates that the content is encoded as UTF-8. I want to print the content to the console as CP936 (Simplified Chinese), but the output is not readable.
Result := TEncoding.Utf8.GetString(ResponseBuffer);
I do the same thing in Python (using httplib2) without any problems.
import httplib2

def python_try():
    conn = httplib2.Http()
    # request() returns a (response headers, body bytes) pair
    response, content = conn.request(...)
    print(content.decode('utf8'))  # readable in console
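To make the intended conversion explicit, here is a minimal sketch of what I expect to happen: decode the UTF-8 bytes into text, then encode that text as CP936 for the console. The helper name `utf8_to_cp936` and the sample string are only illustrative and are not part of my real code.

```python
def utf8_to_cp936(raw):
    """Decode UTF-8 bytes and re-encode them as CP936 (GBK)."""
    text = raw.decode('utf-8')
    # 'replace' substitutes characters that CP936 cannot represent
    return text.encode('cp936', 'replace')

# Illustrative sample: two Chinese characters as UTF-8 bytes
sample = u'\u4e2d\u6587'.encode('utf-8')
console_bytes = utf8_to_cp936(sample)
```

This only works if `raw` really is UTF-8 text. If the bytes are still compressed, or are in another charset, the decode step fails or produces garbage.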
UPDATE 1
I debugged the raw response and noticed that the content is gzipped.
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=UTF-8
Transfer-Encoding: chunked
Content-Encoding: gzip
Vary: Accept-Encoding
Date: Mon, 24 Dec 2012 15:27:44 GMT
Connection: Keep-Alive
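With a `Content-Encoding: gzip` header like this, the body has to be decompressed before it can be decoded as text. As far as I understand, httplib2 handles this transparently, which would explain why the Python version works. Below is a minimal sketch of the manual equivalent using only the standard library. The function name `decode_body` and its parameters are hypothetical, and the payload is built locally instead of being fetched.

```python
import gzip
import io

def decode_body(body, content_encoding, charset):
    """Undo gzip content encoding (if any), then decode the bytes to text."""
    if content_encoding == 'gzip':
        body = gzip.GzipFile(fileobj=io.BytesIO(body)).read()
    return body.decode(charset)

# Build a gzipped UTF-8 payload locally to stand in for the response body
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode='wb') as f:
    f.write(u'\u4e2d\u6587'.encode('utf-8'))
compressed = buf.getvalue()

text = decode_body(compressed, 'gzip', 'utf-8')
```

Calling `compressed.decode('utf-8')` directly, without the gzip step, is the kind of mistake that yields unreadable output.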
I tried assigning a TIdCompressorZLib instance to the TIdHTTP instance. Unfortunately, the application crashes while decompressing the gzipped content. The test address is "http://www.baidu.com" (encoding=gb2312).
UPDATE 2
I also tried downloading a gzipped jQuery script file, which contains only ASCII characters. This time it works, which suggests that the problem lies in the Indy library. If I am not mistaken, I should close the question.