Our client's site uses ISO-8859-1 on its main site and UTF-8 in its "/blog/" directory, which hosts its WordPress blog with a template that uses UTF-8 encoding.
This is fine, but on our main site we also use WordPress API functions such as get_the_excerpt() to get the latest news from the blog and display it on our home page. The problem is that some characters generated by MS Word are special characters that display fine on the blog but display like this on our home page:
Key Brand â€“ test
I tried changing my meta character encoding to UTF-8, but it didn't help. Instead, this PHP code works:
htmlentities($excerpt_text, 1, "UTF-8", 0)
Even though I encode it from UTF-8, it works fine on my ISO-8859-1 template. I'm not too experienced on the character-encoding side of things, and I'll go ahead with the above fix, but I just want to know if anyone can explain why the above works and why changing my character encoding didn't work? The character itself is valid (e.g. the - hyphen in Word and the 'quotes' generated in Word).
[UPDATE] Actually, it doesn't work fine. The above also converts my "read more" link into visible <a href> text, i.e. the HTML tags themselves get escaped and shown to the reader :( Any ideas how I can fix this?
Thanks, Rishi
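The garbling described in the question can be reproduced in isolation, which may help show where it comes from: the blog hands over UTF-8 bytes, and the ISO-8859-1 page reads each byte as a separate character. The sketch below assumes PHP 7 or later with the mbstring extension, and assumes the ISO-8859-1 page is rendered as Windows-1252, which is how browsers generally treat that label. The helper name render_as_cp1252 is hypothetical.

```php
<?php
// Hypothetical helper: show what a page declared as ISO-8859-1 (assumed to be
// rendered as Windows-1252) displays when it is handed raw UTF-8 bytes.
function render_as_cp1252(string $utf8_bytes): string
{
    // Treat every input byte as one Windows-1252 character and return the
    // result as UTF-8 so it can be printed or compared.
    return mb_convert_encoding($utf8_bytes, 'UTF-8', 'Windows-1252');
}

$en_dash = "\u{2013}";          // the dash Word substitutes for " - "
echo bin2hex($en_dash), "\n";   // e28093 (one character, three bytes)
echo render_as_cp1252("Key Brand " . $en_dash . " test"), "\n";
// Key Brand â€“ test
```

One character stored as three bytes becomes three unrelated characters on the home page.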
htmlentities will convert non-ASCII characters to HTML entities (&rsquo; etc.), which will then be interpreted correctly regardless of whether the client is expecting latin1 or utf8.
mb_convert_encoding($excerpt_text, "ISO-8859-1", "UTF-8")
is probably what you need to do the conversion. If the WP blog contains non-latin1 characters, you're SOL of course.
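On the follow-up about the "read more" link: htmlentities also escapes <, > and &, so the markup gets converted along with the text. One possible workaround is to turn only the non-ASCII characters into numeric entities, so existing tags pass through untouched. This is a minimal sketch, assuming PHP 7 or later with the mbstring extension. The helper name and the sample string are made up for illustration; mb_encode_numericentity is a standard mbstring function.

```php
<?php
// Hypothetical helper: replace every non-ASCII character in a UTF-8 string
// with a numeric HTML entity, leaving ASCII (including < > & ") unchanged.
function non_ascii_to_entities(string $utf8_html): string
{
    // Map format: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8_html, $map, 'UTF-8');
}

// Sample excerpt (made up): an en dash plus a link that must stay a link.
$excerpt_text = "Key Brand \u{2013} test <a href=\"/blog/\">read more</a>";
echo non_ascii_to_entities($excerpt_text), "\n";
// Key Brand &#8211; test <a href="/blog/">read more</a>
```

Because the output is pure ASCII, it should display the same on the ISO-8859-1 page and on the UTF-8 blog. Note also that the en dash and curly quotes are not part of strict ISO-8859-1, so mb_convert_encoding with "ISO-8859-1" as the target will substitute them (by default with "?"). Those characters do exist in Windows-1252, so that target may be worth trying if a byte-level conversion is preferred.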