I am trying to load XML content from a remote host using Node.js.
The problem is that German umlauts like "ä" are broken. As in the browser, this is usually a simple encoding problem. But since the XML content on the remote host is encoded in "iso-8859-2", I had no success getting the letters to display correctly.
The functionality is very simple: I use the default HTTP client integrated in Node.js to connect to a remote host with a simple GET request.
Some environment facts:
- The remote system uses "iso-8859-2" encoding.
- The encoding is currently set in the response header.
- The characters are unrecoverably broken in the data (chunk) received by
response.onData(chunk)
Node.js version 0.2 is running on a default Debian server.
The code is based on the default httpClient as described in the Node.js documentation.
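One way the "unrecoverably broken" symptom described above can arise is when single-byte ISO-8859-2 data is decoded as UTF-8 before the application sees it. The following is a minimal sketch of that effect, assuming a current Node.js with the `Buffer` API (not the 0.2 API used in the question); the sample word and variable names are made up for illustration.

```javascript
// "März" as ISO-8859-2 bytes; 0xE4 is "ä" in ISO-8859-2 (and in ISO-8859-1).
const raw = Buffer.from([0x4d, 0xe4, 0x72, 0x7a]);

// Decoding as UTF-8: 0xE4 announces a multi-byte sequence, but the next
// byte (0x72, "r") is not a continuation byte, so the decoder substitutes
// the replacement character U+FFFD for the invalid byte.
const asUtf8 = raw.toString('utf8');

// Re-encoding the damaged string does not bring 0xE4 back: the replacement
// character is written out as the three bytes EF BF BD.
const roundTrip = Buffer.from(asUtf8, 'utf8');

// The German umlauts sit at the same code points in ISO-8859-1 and
// ISO-8859-2, so for this particular sample the built-in 'latin1'
// decoder gives the expected text. This does NOT hold for all of
// ISO-8859-2 (for example the Polish and Czech letters differ).
const asLatin1 = raw.toString('latin1');

console.log(JSON.stringify(asUtf8), roundTrip, asLatin1);
```

If this is what happens here, no encode/decode step applied to the already-decoded string can repair it; the raw bytes have to be kept until they are decoded with the right charset.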
I tried the following:
response.defaultAsciiEncoding true/false
response.encoding = UTF-8/ascii
I used a UTF-8 encoder/decoder to encode/decode the chunk. After this failed I tried to encode/decode the whole response body.
I am not very familiar with using buffers, and I guess the problem must lie in that direction. My second guess is that Node.js (or the httpClient) simply can't handle other encodings by default. In that case I think I would need to write my own HTTP client using the net lib. I just want to make sure I don't walk in the wrong direction :)
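For reference, the buffer-based direction mentioned above could look roughly like the sketch below. It assumes a current Node.js in which `TextDecoder` is a global and the build includes full ICU data (so that the "iso-8859-2" label is recognised); neither is available in Node.js 0.2. `decodeChunks` and `fetchDecoded` are hypothetical helper names, not part of any Node.js API.

```javascript
const http = require('http');

// Join the raw Buffer chunks first, then decode once with the charset
// announced by the server. Decoding chunk by chunk is avoided on purpose.
function decodeChunks(chunks, charset) {
  const body = Buffer.concat(chunks);
  return new TextDecoder(charset).decode(body);
}

// Hypothetical helper: GET a URL and hand the decoded body to a callback.
// No setEncoding() call is made, so 'data' events deliver raw Buffers.
function fetchDecoded(url, charset, callback) {
  http.get(url, (response) => {
    const chunks = [];
    response.on('data', (chunk) => chunks.push(chunk));
    response.on('end', () => callback(null, decodeChunks(chunks, charset)));
  }).on('error', (err) => callback(err));
}

// Usage with bytes in place of a real response: "März" split across chunks.
const sample = [Buffer.from([0x4d, 0xe4]), Buffer.from([0x72, 0x7a])];
console.log(decodeChunks(sample, 'iso-8859-2'));
```

The point of the design is that the response is never turned into a string until all bytes are available and the charset is known, so writing a separate HTTP client on top of the net lib should not be necessary for this purpose.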