I'm writing a crawler that uses Hpricot. It downloads a list of strings from a webpage, and then I try to write them to a file. Something is wrong with the encoding:
"\xC3" from ASCII-8BIT to UTF-8
I have items which are rendered on a webpage and printed this way:
Développement
str.encoding returns UTF-8, so force_encoding('UTF-8') doesn't help. How may I convert this to readable UTF-8?
"ruby 1.9: invalid byte sequence in UTF-8" described another good approach with less code:
It seems your string thinks it is UTF-8, but in reality it is something else, probably ISO-8859-1.
Define (force) the correct encoding first, then convert it to UTF-8.
In your example:
An alternative is:
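A sketch of the other direction, starting from the raw byte in the error message. It assumes the byte is ISO-8859-1, which is a guess about the source data, and the names raw and converted are illustrative:

```ruby
# The raw byte from the error message, held as a binary (ASCII-8BIT) string.
raw = "\xC3".b

# Label the byte as ISO-8859-1 (an assumed source encoding),
# then transcode it to UTF-8.
converted = raw.force_encoding('ISO-8859-1').encode('UTF-8')

puts converted           # Ã
puts converted.encoding  # UTF-8
```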
If the Ã makes no sense, then try another encoding.
Your string seems to have been encoded the wrong way round:
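To illustrate what "the wrong way round" could mean here, the following sketch reproduces the damage and then reverses it. It assumes the text was mislabeled as ISO-8859-1 somewhere before it reached your code. The variable names are hypothetical:

```ruby
original = "Développement"  # correct UTF-8 text

# How the damage can happen: the UTF-8 bytes are mislabeled as ISO-8859-1
# and then transcoded to UTF-8, so every byte becomes its own character.
garbled = original.dup.force_encoding('ISO-8859-1').encode('UTF-8')
puts garbled                # Développement

# Undoing it: apply the same two steps in the opposite direction.
restored = garbled.encode('ISO-8859-1').force_encoding('UTF-8')
puts restored == original   # true
```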