Wordpress uses UTF-8, and the rest of my site uses

Posted 2019-09-08 06:53

Question:

Our client's site uses ISO-8859-1 on its main site and UTF-8 in its "/blog/" directory, where its WordPress blog runs on a template that uses UTF-8 encoding.

This is fine, but on our main site we also use WordPress API functions such as get_the_excerpt() to get the latest news from the blog and display it on our home page. The problem is that some characters pasted from MS Word seem to be special characters: they display fine on the blog, but display like this on our home page:

Key Brand â€“ test

I tried changing my meta character encoding to UTF-8, but it didn't help. Instead, this PHP code works:

htmlentities($excerpt_text, 1, "UTF-8", 0)

Even though I encode it from UTF-8, it works fine on my ISO-8859-1 template. I'm not too experienced on the character-encoding side of things, and I'll go ahead with the above fix, but I just want to know if anyone can explain why the above works and why changing my character encoding didn't work? The character itself is valid (e.g. the - hyphen in Word and the 'quotes' generated in Word).

[UPDATE] Actually, it doesn't work fine. The above also goes ahead and converts my "read more" link to a readable < a href > tag - i.e. the HTML is actually converted :( Any ideas how I can fix this?

Thanks, Rishi

Answer 1:

htmlentities will convert non-ASCII characters to HTML entities (&rsquo; etc.), which will then be interpreted correctly regardless of whether the client is expecting latin1 or utf8.
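To illustrate the point above, and the side effect mentioned in the question's update, here is a minimal sketch. It assumes PHP 7+ with the mbstring extension; the sample string and the helper name encode_non_ascii are made up for illustration. htmlentities() escapes the markup characters as well as the non-ASCII ones, while mb_encode_numericentity() can be limited to code points above ASCII, so the "read more" link stays intact.

```php
<?php
// Sample excerpt: a UTF-8 en dash (U+2013) followed by a "read more" link.
$excerpt = "Key Brand \u{2013} test <a href=\"/blog/\">read more</a>";

// htmlentities() escapes non-ASCII characters AND the markup characters
// < > " &, which is why the link shows up as visible text.
$escaped = htmlentities($excerpt, ENT_QUOTES, 'UTF-8');

// Hypothetical helper: convert only non-ASCII characters to numeric
// entities and leave ASCII (including HTML tags) untouched.
function encode_non_ascii(string $utf8): string
{
    // Map format: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}

$safe = encode_non_ascii($excerpt);
echo $safe, "\n";
```

Because the result is pure ASCII, it should render the same whether the page is served as ISO-8859-1 or UTF-8.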

mb_convert_encoding($excerpt_text, "ISO-8859-1", "UTF-8") is probably what you need to do the conversion. If the WP blog contains non-latin1 characters, you're SOL of course.
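A rough sketch of that conversion, again assuming PHP 7+ with mbstring and a made-up sample string. One caveat: the en dash and curly quotes that Word produces are not part of strict ISO-8859-1, so they fall into the "non-latin1" case. Browsers treat a page labelled ISO-8859-1 as Windows-1252, which does contain those characters, so targeting Windows-1252 may be the more practical choice here.

```php
<?php
// Sample excerpt containing a UTF-8 en dash (U+2013).
$excerpt = "Key Brand \u{2013} test";

// Strict ISO-8859-1 has no en dash, so mbstring substitutes a
// placeholder character (typically '?') for it.
$latin1 = mb_convert_encoding($excerpt, 'ISO-8859-1', 'UTF-8');

// Windows-1252 maps the en dash to the single byte 0x96.
$cp1252 = mb_convert_encoding($excerpt, 'Windows-1252', 'UTF-8');

echo bin2hex($cp1252), "\n";
```

Characters outside Windows-1252 (for example, CJK text) would still be lost this way, so an entity-based approach is the safer fallback if the blog may contain them.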