Submit form with get method in non UTF-8 encoding

Published 2019-09-03 14:04

Question:

  1. I have a page in a non-UTF-8 encoding.
  2. There is a form with method="get". If I submit non-ASCII characters, they end up in the URI, encoded in the page's encoding.
  3. When I try to run decodeURIComponent() on the URI, I get the infamous error: URIError: malformed URI sequence.
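Step 3 can be reproduced in isolation. The following is a minimal sketch, assuming the submitted character is "š" (byte 0xB9 in ISO-8859-2, which matches the `input=%B9` query string reported by Firefox); `tryDecode` is a hypothetical helper introduced only for illustration.

```javascript
// "š" is U+0161: a single byte 0xB9 in ISO-8859-2, but two bytes 0xC5 0xA1 in UTF-8.
const utf8Escaped = encodeURIComponent('\u0161'); // always escapes as UTF-8
const legacyEscaped = '%B9'; // what a form on an ISO-8859-2 page puts in the query string

// Hypothetical helper: call decodeURIComponent() and report success or the error name.
function tryDecode(s) {
  try {
    return { ok: true, value: decodeURIComponent(s) };
  } catch (e) {
    return { ok: false, error: e.name };
  }
}

const utf8Result = tryDecode(utf8Escaped);     // decodes back to "š"
const legacyResult = tryDecode(legacyEscaped); // fails: 0xB9 alone is not valid UTF-8

console.log(utf8Escaped, utf8Result, legacyResult);
```

decodeURIComponent() interprets the escaped bytes as UTF-8 only, so a lone 0xB9 byte is rejected even though it is a perfectly valid character in the page's own encoding.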

Please follow the testcase.

Questions:

  1. In which of the above steps (1, 2, 3) is the problem? 1 should be OK. 2 is the standard way to submit <form>. And 3 is the standard function. All these things are standard! Yet there must be an error somewhere.

  2. Are characters of encodings other than UTF-8 allowed in URIs? (If not, the problem apparently is in step 2).

  3. After the problem is correctly diagnosed, the question is - what would be a clean solution to it? The page must stay in the non UTF-8 encoding.

Answer 1:

Firefox says:

[11:38:39.275] A form was submitted in the ISO-8859-2 encoding which cannot encode all Unicode characters, so user input may get corrupted. To avoid this problem, the page should be changed so that the form is submitted in the UTF-8 encoding either by changing the encoding of the page itself to UTF-8 or by specifying accept-charset=utf-8 on the form element. @ http://artax.karlin.mff.cuni.cz/~ttel5535/pub/bugs/form_get_submit_non_utf-8/non_utf8_uri_test.html?input=%B9
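If the form cannot be switched to accept-charset=utf-8 and the URI still has to be decoded in script, one possible approach is to turn the percent escapes back into raw bytes and decode those bytes with the page's encoding instead of calling decodeURIComponent(). The sketch below is not a standard API: `decodeLegacyURIComponent` is a hypothetical helper, and it assumes a runtime whose `TextDecoder` supports the `iso-8859-2` label (current browsers do; Node.js needs a build with full ICU). It also assumes that any unescaped characters in the input are plain ASCII.

```javascript
// Hypothetical helper: decode a form-encoded value whose percent escapes
// represent bytes in a legacy (non-UTF-8) encoding such as ISO-8859-2.
function decodeLegacyURIComponent(str, encoding) {
  const bytes = [];
  let i = 0;
  while (i < str.length) {
    const hex = str.slice(i + 1, i + 3);
    if (str[i] === '%' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      // A %XX escape stands for exactly one raw byte.
      bytes.push(parseInt(hex, 16));
      i += 3;
    } else if (str[i] === '+') {
      // Form submissions encode a space as "+".
      bytes.push(0x20);
      i += 1;
    } else {
      // Assumption: anything left unescaped is ASCII.
      bytes.push(str.charCodeAt(i) & 0x7f);
      i += 1;
    }
  }
  // Interpret the collected bytes in the page's encoding, not UTF-8.
  return new TextDecoder(encoding).decode(new Uint8Array(bytes));
}

const decoded = decodeLegacyURIComponent('%B9', 'iso-8859-2');
console.log(decoded);
```

With the `%B9` value from the query string above, this yields "š" rather than throwing. Submitting the form as UTF-8, as the Firefox message recommends, remains the simpler fix because it lets the page keep its encoding while making the standard decodeURIComponent() work again.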