I'm trying to convert some special characters like ä, ö, ü, α, μ, α, ο, ι, and others from a webpage. When I download the page with ASIHTTPRequest, I get some codes instead of the characters themselves. Examples:
ä = \u00E4
μ = \u03BC
α = \u03B1
This also happens if I use [NSString stringWithContentsOfURL:aNSURL encoding:NSASCIIStringEncoding error:nil];
I have tried the different encodings available, but none of them work for the above example. For example, with NSUnicodeStringEncoding I get some strange, 'Chinese'-looking characters, and with NSASCIIStringEncoding I get these numbers and letters.
The strange thing is that if I look at the source code of the webpage in a web browser like Safari, it's all fine, with the normal HTML character entities, like: ä = &auml;
Is there any way to convert these encoded letters back?
Thanks
EDIT
Sorry, I forgot to mention the source code as shown in a browser above.
I just noticed on this site: link that the hex HTML entity is very similar to what I got with this code. Examples:
ä = &#xe4;
μ = &#x3bc;
α = &#x3b1;
As you can maybe see, they are very similar: just make it lowercase, replace the 0's with one x, add &# at the beginning, and add a ; at the end.
I will just have to write a small piece of code to convert the numbers and letters to the hex entities, which is not going to be a big problem. Then I just have to use an HTML entity converter, and I'm done.
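The conversion described above could be sketched roughly as follows. This is only a minimal sketch: the helper name HexEntitiesFromUnicodeEscapes is made up for illustration, and it assumes every escape has exactly four hex digits. It keeps the leading zeros and the original letter case, which should be fine because hex character references accept both (&#x00E4; means the same as &#xe4;).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from any library): rewrites each \uXXXX escape
// in the input as the hex HTML entity &#xXXXX;.
static NSString *HexEntitiesFromUnicodeEscapes(NSString *input) {
    NSError *error = nil;
    // Regex: a literal backslash, the letter u, then exactly four hex digits.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\\\u([0-9A-Fa-f]{4})"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        // The pattern failed to compile; return the input unchanged.
        return input;
    }
    // $1 is the captured group of four hex digits.
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@"&#x$1;"];
}
```

The result can then be passed to whatever HTML entity converter you already use.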
Anyway, thanks a lot for helping me out again
Sean
You can use the found at this link. It uses a built in method from the CFXML parser. It describes the code below
Alternatively you can use
NSString* sI = (NSString*)CFXMLCreateStringByUnescapingEntities(NULL, (CFStringRef)s, NULL);
which is available depending on which OS you are building for. You can also check this out and use it: https://github.com/mwaterfall/MWFeedParser/blob/master/Classes/NSString+HTML.m
Check using this method:
After having another try with Rob Mayoff's code, it worked! Here is the link to his answer:
Converting escaped UTF8 characters back to their original form
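For anyone who cannot open the link, one common way to decode \uXXXX escapes directly (without going through HTML entities) is CFStringTransform with the "Any-Hex/Java" transform applied in reverse. The following is a sketch under assumptions and may not match the linked answer line for line: the helper name StringByDecodingUnicodeEscapes is made up, and the __bridge cast assumes ARC (without ARC, a plain cast to CFMutableStringRef plus autorelease would be needed).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: turns literal \uXXXX sequences back into characters.
// Assumes ARC is enabled (because of the __bridge cast).
static NSString *StringByDecodingUnicodeEscapes(NSString *input) {
    NSMutableString *result = [input mutableCopy];
    // "Any-Hex/Java" normally converts characters into \uXXXX escapes;
    // passing YES for the reverse argument runs it backwards.
    // A NULL range means the whole string is transformed in place.
    CFStringTransform((__bridge CFMutableStringRef)result,
                      NULL,
                      CFSTR("Any-Hex/Java"),
                      YES);
    return result;
}
```

Usage would look like `NSString *decoded = StringByDecodingUnicodeEscapes(downloadedString);`, where downloadedString is the text you got back from the request.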