Getting U+fffd/65533 instead of special character

Posted 2020-03-07 12:37

Question:

I have a C# .NET web project whose globalization tag is set to:

<globalization requestEncoding="utf-8" responseEncoding="utf-8" culture="nb-no" uiCulture="no"/>

A Flash application requests this URL (you get the same problem when you enter the URL manually in a browser): c_product_search.aspx?search=kjøkken (alternatively: c_product_search.aspx?search=kj%F8kken)

Both return the following character codes:

k U+006b 107
j U+006a 106
� U+fffd 65533
k U+006b 107
k U+006b 107
e U+0065 101
n U+006e 110

I don't know much about character encoding, but it seems that the ø has been replaced with the Unicode replacement character, right?

I tried to change the globalization tag to:

<globalization requestEncoding="iso-8859-1" responseEncoding="utf-8" culture="nb-no" uiCulture="no"/>

That made the request work. However, now, other searches on my page stopped working.

I also tried the following with similar results:

NameValueCollection qs = HttpUtility.ParseQueryString(Request.QueryString.ToString(), Encoding.GetEncoding("iso-8859-1"));
string search = (string)qs["search"];

What should I do?

Kind Regards,

nitech

Answer 1:

The problem comes from the combination of Firefox and ASP.NET. When you manually enter a URL in Firefox's address bar and the URL contains French or Swedish characters, Firefox encodes the URL as ISO-8859-1 by default.

But when ASP.NET receives such a URL, it assumes the URL is UTF-8 encoded, and the bytes that are not valid UTF-8 are decoded as U+FFFD. I couldn't find a way in ASP.NET to detect that the URL is ISO-8859-1. Request.Encoding is set to utf-8 ... :(
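
To see the effect outside a web request, here is a minimal console sketch. The class name DecodeDemo is made up for illustration, and it assumes the project can reference System.Web (for HttpUtility). It decodes the same percent-encoded text once as UTF-8 and once as ISO-8859-1:

```csharp
using System;
using System.Text;
using System.Web;

class DecodeDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        const string raw = "kj%F8kken"; // the value from the question
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // 0xF8 on its own is not a valid UTF-8 sequence, so the decoder
        // substitutes the replacement character U+FFFD.
        string asUtf8 = HttpUtility.UrlDecode(raw, Encoding.UTF8);

        // In ISO-8859-1 the single byte 0xF8 is the letter ø (U+00F8).
        string asLatin1 = HttpUtility.UrlDecode(raw, latin1);

        Console.WriteLine((int)asUtf8[2]);   // 65533
        Console.WriteLine((int)asLatin1[2]); // 248
    }
}
```

The first number matches the 65533 reported in the question, which suggests the request bytes were ISO-8859-1 all along and only the decoding step went wrong.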

Several solutions exist :

  • Put <globalization requestEncoding="iso-8859-1" responseEncoding="iso-8859-1"/> in your Web.config. But this may come with other problems, and your application won't be standard anymore (it will not work with languages like Japanese) ... And anyway, I prefer using UTF-8!

  • Go to about:config in Firefox and set the value of network.standard-url.encode-query-utf8 to true. It will now work for you (Firefox will encode all your URLs as UTF-8), but not for anybody else ...

  • The least bad solution I could come up with was to handle this in code. If the default decoding didn't work, we reparse the query string as ISO-8859-1:

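    // Request.QueryString is already decoded with the configured request encoding (UTF-8).
    string query = Request.QueryString["search"];
    // A U+FFFD in the decoded value means some byte was not valid UTF-8: reparse as ISO-8859-1.
    if (query != null && query.IndexOf('\uFFFD') >= 0)
        query = HttpUtility.ParseQueryString(Request.Url.Query, Encoding.GetEncoding("iso-8859-1"))["search"];
    // Kept from the original answer; the values above are normally already URL-decoded.
    query = HttpUtility.UrlDecode(query);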
    

It works with hyperlinks and manually entered URLs, in French, English, or Japanese. But I don't know how it will handle other encodings like ISO-8859-5 (Russian) ...

Does anyone have a better solution?

This solves only the problem of manually entered URLs. In your hyperlinks, don't forget to encode URL parameters with HttpUtility.UrlEncode on the server, or encodeURIComponent in JavaScript code, and use HttpUtility.UrlDecode to decode them.



Answer 2:

    // Returns the query-string value for key, falling back to ISO-8859-1
    // when the default (UTF-8) decoding produced U+FFFD.
    public string GetEncodedQueryString(string key)
    {
        string query = Request.QueryString[key];
        if (query != null && query.IndexOf('\uFFFD') >= 0)
            query = HttpUtility.ParseQueryString(Request.Url.Query, Encoding.GetEncoding("iso-8859-1"))[key];
        return query;
    }
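
The same fallback idea can be tried without a running web application. This is only a sketch: QueryFallback and GetParameter are hypothetical names, the raw query string is passed in as a plain string instead of coming from Request, and a reference to System.Web is assumed.

```csharp
using System;
using System.Text;
using System.Web;

static class QueryFallback // hypothetical helper, not part of ASP.NET
{
    // Decodes one parameter as UTF-8 first; if the result contains U+FFFD,
    // assumes the sender used ISO-8859-1 and decodes again.
    public static string GetParameter(string rawQuery, string key)
    {
        string value = HttpUtility.ParseQueryString(rawQuery, Encoding.UTF8)[key];
        if (value != null && value.IndexOf('\uFFFD') >= 0)
        {
            Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
            value = HttpUtility.ParseQueryString(rawQuery, latin1)[key];
        }
        return value;
    }

    static void Main()
    {
        const string expected = "kj\u00F8kken"; // kjøkken

        // ISO-8859-1 input: the UTF-8 pass fails, the fallback recovers the ø.
        Console.WriteLine(GetParameter("?search=kj%F8kken", "search") == expected);    // True

        // UTF-8 input: the first pass succeeds and no fallback is needed.
        Console.WriteLine(GetParameter("?search=kj%C3%B8kken", "search") == expected); // True
    }
}
```

This is a heuristic, not real detection. A client that deliberately sends U+FFFD as UTF-8 (%EF%BF%BD) would trigger the fallback, and some ISO-8859-1 byte sequences happen to be valid UTF-8 and would pass through undetected.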


Answer 3:

I think your problem is in the Flash application, not in .NET. It sends the special character in a weird way. Try to URL-encode the search string before you send it to the server.



Answer 4:

If the app is expecting the URL-encoded request to be based on UTF-8, the character "ø" should be "%C3%B8", not "%F8". Whatever function you're using to escape/encode that request, you probably need to pass it the name of the underlying character encoding, "UTF-8".
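
To check the two byte sequences yourself, a small console sketch (the class name EncodeDemo is made up for illustration) can print the bytes of "ø" in each encoding, together with what a UTF-8 based escaper such as Uri.EscapeDataString produces:

```csharp
using System;
using System.Text;

class EncodeDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        const string letter = "\u00F8"; // ø

        // UTF-8 needs two bytes for ø, ISO-8859-1 needs one.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(letter)));                      // C3-B8
        Console.WriteLine(BitConverter.ToString(Encoding.GetEncoding("iso-8859-1").GetBytes(letter))); // F8

        // Uri.EscapeDataString percent-encodes the UTF-8 bytes.
        Console.WriteLine(Uri.EscapeDataString("kj" + letter + "kken")); // kj%C3%B8kken
    }
}
```

If the client produces kj%C3%B8kken, a server configured with requestEncoding="utf-8" should decode it without any replacement characters.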



Answer 5:

It turns out that ActionScript 2.0 sends the URL encoded/escaped with UTF-8, while ActionScript 3.0 uses ISO-8859-1. The way to solve this was to change the Request.ContentEncoding value inside Global.asax if an encoding is specified in the URL:

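void Application_BeginRequest(object sender, EventArgs e)
{
    HttpContext ctx = HttpContext.Current;

    // Encoding specified, e.g. ?encoding=iso-8859-1 ?
    string name = ctx.Request["encoding"];
    if (!String.IsNullOrEmpty(name))
    {
        try
        {
            ctx.Request.ContentEncoding = System.Text.Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            // Unknown encoding name: keep the configured request encoding.
        }
    }
}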

Could it be done differently?

Regards, nitech