I'm connecting to an external API using cfhttp, with the returned data in XML format. I have no control over the API or the format it's returned in.
When the data is returned, I loop through it and do cfquery inserts into my own MySQL database, which has a UTF8 charset.
However, some of the data appears to contain Unicode characters that aren't coming through correctly (it looks like it should be the £ (pound) sign, but when I cfdump the XMLParsed data, it shows as a diamond with a ? inside). I've attached a cropped screenshot of part of the cfdump showing this.
The problem is the cfquery insert: when it gets to those characters, it returns this error:
Error Executing Database Query.
Incorrect string value: '\xEF\xBF\xBD10 ...' for column 'voucherTitle' at row 1
I've tried setting the charset in the cfhttp call, but get the same result.
Is there any way I can either encode/decode these, or alternatively trim them out altogether? (The data gets edited further down the line anyway, so manually adding the correct symbols isn't a huge issue.)
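UPDATE: As of MySQL 5.5.3, there is also utf8mb4, which is often recommended over utf8 because it can store the full range of 4-byte UTF-8 characters, whereas utf8 is limited to 3 bytes per character.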
(From the comments)
I recall something similar on another thread. Double check the collation and character set for that column using the INFORMATION_SCHEMA.COLUMNS view:
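For example, a query along these lines should show both values. The schema and table names `yourDatabase` and `yourTable` are placeholders, not names from the original post; only the column name `voucherTitle` comes from the error message.

```sql
-- Show the character set and collation of the column named in the error.
-- 'yourDatabase' and 'yourTable' are hypothetical; substitute the real names.
SELECT COLUMN_NAME,
       CHARACTER_SET_NAME,
       COLLATION_NAME
FROM   INFORMATION_SCHEMA.COLUMNS
WHERE  TABLE_SCHEMA = 'yourDatabase'
  AND  TABLE_NAME   = 'yourTable'
  AND  COLUMN_NAME  = 'voucherTitle';
```

If CHARACTER_SET_NAME comes back as something like latin1 rather than utf8 or utf8mb4, the column itself is probably not using the database's default charset.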
If it is not UTF-8, you can change it using the ALTER TABLE command. Modify the column size M as needed.
NB: If the data is important, always make a backup of the table before applying any modifications.
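A rough sketch of the backup and the column change might look like the following. It assumes a hypothetical table name `yourTable`, a hypothetical backup name `yourTable_backup`, that the column is a VARCHAR, and an arbitrary size of 255 standing in for M; none of these come from the original post.

```sql
-- Optional: keep a copy of the table before changing it.
-- yourTable and yourTable_backup are hypothetical names.
CREATE TABLE yourTable_backup LIKE yourTable;
INSERT INTO yourTable_backup SELECT * FROM yourTable;

-- Change the column's character set. Replace 255 with the real size M.
-- MODIFY restates the whole column definition, so repeat any
-- NOT NULL / DEFAULT attributes the column already has.
-- utf8mb4 needs MySQL 5.5.3+; on older servers use utf8 / utf8_general_ci.
ALTER TABLE yourTable
    MODIFY voucherTitle VARCHAR(255)
    CHARACTER SET utf8mb4
    COLLATE utf8mb4_unicode_ci;
```

Re-running the INFORMATION_SCHEMA.COLUMNS check afterwards should confirm that the new character set and collation took effect.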
See also: 11.1.15 Character Sets and Collations Supported by MySQL