I'm writing a php program that pulls from a database source. Some of the varchars have quotes that are displaying as black diamonds with a question mark in them (�, REPLACEMENT CHARACTER, I assume from Microsoft Word text).
How can I use php to strip these characters out?
Just Paste This Code In Starting to The Top of Page.
For global purposes.
Instead of converting, codifying, decodifying each text I prefer to let them as they are and instead change the server php settings. So,
Edit your php.ini and add:
default_charset = "ISO-8859-1"
or instead of ISO-8859 the one which fits your text encoding.
Add this function to your variables utf8_encode($your variable);
Go to your phpmyadmin and select your database and just increase the length/value of that table's field to 500 or 1000 it will solve your problem.
Using the same charset (as suggested here) in both the database and the HTML has not worked for me... So remembering that the code is generated as HTML, I chose to use the
"
(HTML code) or the"
(ISO Latin-1 code) in my database text where quotes were used. This solved the problem while providing me a quotation mark. It is odd to note that prior to this solution, only some of the quotation marks and apostrophes did not display correctly while others did, however, the special code did work in all instances.If you see that character (� U+FFFD "REPLACEMENT CHARACTER") it usually means that the text itself is encoded in some form of single byte encoding but interpreted in one of the unicode encodings (UTF8 or UTF16).
If it were the other way around it would (usually) look something like this: ä.
Probably the original encoding is ISO-8859-1, also known as Latin-1. You can check this without having to change your script: Browsers give you the option to re-interpret a page in a different encoding -- in Firefox use "View" -> "Character Encoding".
To make the browser use the correct encoding, add an HTTP header like this:
or put the encoding in a meta tag:
Alternatively you could try to read from the database in another encoding (UTF-8, preferably) or convert the text with
iconv()
.