I am starting with a string containing an encoded Unicode character, "&#xfc;". I pass the string to an object that performs some logic and returns another string. In the returned string, the original encoded character has been converted to its Unicode equivalent, "ü".
I need to get the original encoded character back, but so far I have not been able to.
I have tried using the HttpUtility.HtmlEncode() method, but that returns "&#252;", which is not the same.
Can anyone help?
I just had to sort this out yesterday.
It's a bit more complicated than just looking at a single character. You need to roll your own HtmlEncode() method. Strings in the .NET world are UTF-16 encoded. A Unicode code point (what an HTML numeric character reference identifies) can be as large as U+10FFFF, so it is normally handled as a 32-bit integer and does not always fit in a single 16-bit char. This is mostly an issue if you have to deal with characters outside Unicode's Basic Multilingual Plane, which UTF-16 stores as surrogate pairs.
This code should do what you want
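A minimal sketch of such a hand-rolled encoder, assuming you want every non-ASCII code point written as a lowercase hexadecimal reference and ASCII characters (including & and <) left untouched. The class and method names here are hypothetical, not part of the framework:

```csharp
using System;
using System.Text;

public static class HexEntityEncoder
{
    // Hypothetical helper: replaces every non-ASCII code point with a
    // hexadecimal numeric character reference such as &#xfc;.
    public static string EncodeNonAsciiAsHex(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            // ConvertToUtf32 combines a surrogate pair into one code point
            // and throws ArgumentException on an unpaired surrogate.
            int codePoint = char.ConvertToUtf32(input, i);
            if (char.IsSurrogatePair(input, i))
            {
                i++; // skip the low surrogate that was just consumed
            }

            if (codePoint > 0x7F)
            {
                sb.Append("&#x").Append(codePoint.ToString("x")).Append(';');
            }
            else
            {
                sb.Append((char)codePoint);
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, HexEntityEncoder.EncodeNonAsciiAsHex("Müller") gives "M&#xfc;ller", and a character outside the Basic Multilingual Plane such as U+1F600 becomes the single reference "&#x1f600;" rather than two surrogate halves.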
Hope this helps!
They are pretty much the same, at least for display purposes. HttpUtility.HtmlEncode is using decimal encoding, which is in the format &#DECIMAL; while your original version is in hexadecimal encoding, i.e. in the format &#xHEX;. Since fc in hex is 252 in decimal, the two are equivalent.
If you really need to get the hex-encoded version, then consider parsing out the decimal and converting it to hex before stuffing it back into the &#xHEX; format. Something like:
Or you can try this code:
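One way to sketch the decimal-to-hex conversion described above is a regular expression that finds each decimal reference and rewrites it. The names EntityConverter and DecimalToHexReferences are hypothetical, and the sketch assumes the input has already been encoded (for example by HttpUtility.HtmlEncode) so that the references are in decimal form:

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class EntityConverter
{
    // Matches decimal references such as &#252; but not hex ones like &#xfc;.
    // [0-9] is used instead of \d so that only ASCII digits match.
    private static readonly Regex DecimalReference = new Regex("&#([0-9]+);");

    // Hypothetical helper: rewrites each decimal reference as a hex reference.
    public static string DecimalToHexReferences(string html)
    {
        if (html == null) throw new ArgumentNullException(nameof(html));

        return DecimalReference.Replace(html, match =>
        {
            int codePoint;
            bool parsed = int.TryParse(
                match.Groups[1].Value,
                NumberStyles.None,
                CultureInfo.InvariantCulture,
                out codePoint);

            // Leave the reference unchanged if the number does not fit in an int.
            if (!parsed) return match.Value;

            return "&#x" + codePoint.ToString("x", CultureInfo.InvariantCulture) + ";";
        });
    }
}
```

For the string in the question, EntityConverter.DecimalToHexReferences("&#252;") gives "&#xfc;". References that are already hexadecimal are left alone, because the pattern only accepts digits after "&#".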