I've got a PHP script that is reading in some JSON data provided by a client. The JSON data provided had a single 'smart quote' in it.
Example:
{
"title" : "Lorem Ipsum’s Dolar"
}
In my script I'm using a small function to get the json data:
public function getJson($url) {
$filePath = $url;
$fh = fopen($filePath, 'r') or die();
$temp = fread($fh, filesize($filePath));
$temp = utf8_encode($temp);
echo $temp . "<br />";
$json = json_decode($temp);
fclose($fh);
return $json;
}
If I utf8 encode the data, when I echo it out I see nothing where the quote should be. If I don't utf8 encode the data, when I echo it out I see the funny question mark symbol �
Any thoughts on how to actually see the proper character??
Thanks!
Is it possibe that the server is sending the json data in an encoding like windows-1252? That codepage has some smart code characters where iso-8859 has control characters. Could you try to use iconv("windows-1252", "utf-8", $temp)
instead of utf8_encode
. Even better would be if the server already sends utf-8 encoded json, since that is the recommended encoding per rfc4627.
The issue is more on the side, that generates the JSON file. There you should escape the ' by \'
If you can't modify this part, you should do it like this with addslashes:
$temp = fread($fh, filesize($filePath));
$temp = utf8_encode($temp);
echo $temp . "<br />";
$temp = addslashes($temp);
$json = json_decode($temp);
Can you possibly do a string replace assuming the data is all utf8?
$text = str_replace($find, $replace, $text);
Looking for the characters below?
'“' // left side double smart quote
'â€' // right side double smart quote
'‘' // left side single smart quote
'’' // right side single smart quote
'…' // elipsis
'—' // em dash
'–' // en dash