How to force XMLHttpRequest to use ISO-8859-1 char

Posted 2020-02-11 07:36

Question:

I have an ISO-8859-1 database, so I would like to exchange requests entirely in this code page. How do I set the content type for AJAX requests correctly?

Answer 1:

Even though this is a bad idea (see the comments on the question), it would work:

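var xhr = new XMLHttpRequest();
// Third argument false = synchronous request; `path` is the target URL.
xhr.open("GET", path, false);
// Declare the charset explicitly in the Content-Type request header.
xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded; charset=ISO-8859-1');
xhr.send();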

If you are using jQuery: https://stackoverflow.com/a/553572/2527433



Answer 2:

According to the W3C spec for XMLHttpRequest.send(), the charset will end up being UTF-8 in almost all cases, depending on the value of data. Even a charset you specify explicitly will likely be overwritten with UTF-8:

If a Content-Type header is in author request headers and its value is a valid MIME type that has a charset parameter whose value is not a case-insensitive match for encoding, and encoding is not null, set all the charset parameters of that Content-Type header to encoding.

There is some wiggle room for the user agent to determine the encoding: set the encoding of the page that contains the AJAX code to ISO-8859-1. The UA will then assume ISO-8859-1 for all form submissions (unless the form specifies a different encoding) and likely for AJAX submissions as well, depending on how it interprets the W3C algorithm.

Ultimately, the only reliable solution is to serve the page the visitor sees (the one with the AJAX on it) as ISO-8859-1, and then check the encoding of incoming data and convert it to ISO-8859-1 on the back end. You need to sanitize all user input before sending it to the database anyway, so just add this conversion to that process. There are plenty of library functions for this in PHP or whatever language you use. There is no way to guarantee that a client conforms to the specs, so always verify the encoding on the back end.
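To illustrate the back-end check and conversion described above, here is a minimal sketch in JavaScript. It assumes the value arrived percent-encoded as UTF-8; the helper name toLatin1Bytes and the sample value are hypothetical, and a real back end would more likely use its own language's conversion library.

```javascript
// Hypothetical helper (not a standard API): convert an already decoded string
// to ISO-8859-1 bytes, rejecting anything the code page cannot represent.
function toLatin1Bytes(text) {
  const bytes = [];
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    // ISO-8859-1 covers exactly the code points U+0000 to U+00FF, one byte each.
    if (codePoint > 0xff) {
      throw new RangeError("Not representable in ISO-8859-1: " + ch);
    }
    bytes.push(codePoint);
  }
  return Uint8Array.from(bytes);
}

// Sample input: "café" as a browser would percent-encode it (UTF-8).
const received = decodeURIComponent("caf%C3%A9");
const latin1Bytes = toLatin1Bytes(received); // bytes 0x63 0x61 0x66 0xE9
```

Throwing on characters outside the code page is only one possible policy; replacing them with a placeholder would be another.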



Answer 3:

I think I need to explain encoding and the charset parameter. These concern how the raw bytes sent over the network should be decoded.

For example, consider the content type application/x-www-form-urlencoded and the following data:

0x61253344254345254232

Because no charset was given (in fact, charset is an illegal parameter for this content type), ISO-8859-1 must be assumed. Decoding the bytes above as ISO-8859-1 results in:

"a%3D%CE%B2"

Now there is a second format to decode (form-urlencoded), which has its own rules. The current specs say that the percent-encoding here must be UTF-8, so after this string-to-string transformation the above becomes:

"a=β"

So as you can see, the format never uses characters other than ASCII so the charset doesn't really matter and is not supported anyway.
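The two decoding steps above can be reproduced with a short JavaScript sketch. The variable names are illustrative only; decodeURIComponent is used here because it decodes percent-escapes as UTF-8.

```javascript
// The raw bytes from the example, written as a hex string.
const hex = "61253344254345254232";
const bytes = hex.match(/../g).map((pair) => parseInt(pair, 16));

// Step 1: decode the bytes as ISO-8859-1, which maps each byte 1:1
// to the code point with the same value.
const asLatin1 = String.fromCharCode(...bytes);

// Step 2: undo the percent-encoding, interpreting the escaped bytes as UTF-8.
const decoded = decodeURIComponent(asLatin1);

console.log(asLatin1); // a%3D%CE%B2
console.log(decoded);  // a=β
```

Every byte in step 1 is plain ASCII, which is why the choice of charset for that step makes no difference to the result.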


Your actual problem is unrelated to what encoding the percent encoding uses. Even if you defined a custom function that percent-encodes in ISO-8859-1, the server would still have to decode it on arrival and encode it for the database. You have nothing to gain from this.
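For comparison, this is roughly what such a custom function could look like. It is only a sketch: percentEncodeLatin1 is a hypothetical helper, not a standard API, and it is shown next to the built-in encodeURIComponent, which always uses UTF-8.

```javascript
// Hypothetical helper: percent-encode a string using ISO-8859-1 byte values.
function percentEncodeLatin1(text) {
  let out = "";
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    if (codePoint > 0xff) {
      throw new RangeError("Not representable in ISO-8859-1: " + ch);
    }
    // Leave unreserved ASCII characters alone and escape everything else.
    out += /[A-Za-z0-9\-._~]/.test(ch)
      ? ch
      : "%" + codePoint.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

const utf8Form = encodeURIComponent("ß");    // "%C3%9F"
const latin1Form = percentEncodeLatin1("ß"); // "%DF"
```

Both forms are pure ASCII on the wire. The receiver still has to know which convention was used: a UTF-8 decoder such as decodeURIComponent rejects "%DF" with a URIError, so the server would need its own ISO-8859-1 decoding step either way.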