I'm inserting some data into a database from a form. I'm using addslashes to escape the text (have also tried mysql_real_escape_string with the same result).
Regular quotes are escaped, but some other quotes are not. For example, the string:
Homer's blood becomes the secret ingredient in Moe’s new beer.
is converted to:
Homer\'s blood becomes the secret ingredient in Moe’s new beer.
I didn't think the curly quote would matter unescaped, but only this text is inserted into the database:
Homer's blood becomes the secret ingredient in Moe
So PHP thinks the curly quote is fine, but MySQL is losing the string. MySQL is not giving any errors though.
I would look for a mismatch between the character encoding used in your Web interface and that used at the database level. If your Web interface uses UTF-8, for example, and your database is using the default MySQL encoding of latin1, then you need to set up your tables with DEFAULT CHARSET=utf8.
Use mysql_real_escape_string() or mysqli, by the way. addslashes() is NOT adequate protection against SQL injection.
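As a rough illustration of the mysqli route, here is a minimal sketch that sets the connection charset and inserts the text through a prepared statement, so no manual escaping is involved. The function name insert_summary, the table episodes and the column summary are made-up names for this example, and it assumes $summary already holds UTF-8 text.

```php
<?php
// Sketch only: `insert_summary`, the table `episodes` and the column
// `summary` are hypothetical names, not taken from the question.
function insert_summary(mysqli $db, string $summary): bool {
    // Tell the server which encoding the client sends. 'utf8' matches the
    // DEFAULT CHARSET=utf8 suggested for the tables.
    if (!$db->set_charset('utf8')) {
        return false;
    }
    // A prepared statement sends the value separately from the SQL text,
    // so addslashes() / mysql_real_escape_string() are not needed.
    $stmt = $db->prepare('INSERT INTO episodes (summary) VALUES (?)');
    if ($stmt === false) {
        return false;
    }
    $stmt->bind_param('s', $summary);
    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```

It would be called with an open connection, e.g. insert_summary($db, $_POST['foo']), where $db is whatever mysqli object your script already created.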
The ’ in Moe’s is the only character in your example string that wouldn't be valid if that string is latin1 encoded but your MySQL server expects utf8. Every other character is plain ASCII, which is encoded identically in both.
Simple demonstration:
<?php
function foo($s) {
    echo 'len=', strlen($s), ' ';
    for ($i = 0; $i < strlen($s); $i++) {
        printf('%02X ', ord($s[$i]));
    }
    echo "\n";
}
// my file is latin1 encoded and so is the string literal
foo('Moe’s');
// now try it with a utf8 encoded string
// (utf8_encode() converts ISO-8859-1 to UTF-8; it is deprecated as of PHP 8.2)
foo(utf8_encode('Moe’s'));
prints
len=5 4D 6F 65 92 73
len=6 4D 6F 65 C2 92 73
Therefore the question is: Do you feed the mysql server something in a "wrong" encoding?
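One way to answer that question on the PHP side is to test whether the bytes form well-formed UTF-8 at all. The sketch below is only an illustration: is_valid_utf8 is a made-up helper name, and it relies on preg_match() returning false when the /u modifier is used on a subject that is not valid UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): true if $s is
// well-formed UTF-8. With the /u modifier, preg_match() returns false
// instead of 1 when the subject contains invalid UTF-8.
function is_valid_utf8(string $s): bool {
    return preg_match('//u', $s) === 1;
}

// Byte escapes are used so the result does not depend on this file's encoding.
$latin1 = "Moe\x92s";          // ’ as the single byte 0x92
$utf8   = "Moe\xE2\x80\x99s";  // ’ as U+2019 encoded in UTF-8

var_dump(is_valid_utf8($latin1)); // bool(false)
var_dump(is_valid_utf8($utf8));   // bool(true)
```

Note that the utf8_encode() result from the demonstration, 4D 6F 65 C2 92 73, also passes this check: it is well-formed UTF-8, but it encodes the control character U+0092 rather than ’.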
Each connection has a connection charset, and the MySQL server expects your client (the PHP script) to send data that is encoded in that character set. You can find out what the connection charset is with
SHOW VARIABLES LIKE '%character%'
like in
$mysql = mysql_connect('..', '..', '..') or die(mysql_error());
mysql_select_db('..', $mysql) or die(mysql_error());
$query = "SHOW VARIABLES like '%character%'";
$result = mysql_query($query, $mysql) or die(__LINE__.mysql_error());
while (false !== ($row = mysql_fetch_array($result, MYSQL_ASSOC))) {
    echo join(', ', $row), "\n";
}
This should print something like
character_set_client, utf8
character_set_connection, utf8
character_set_database, latin1
character_set_filesystem, binary
character_set_results, utf8
character_set_server, utf8
character_set_system, utf8
and character_set_client, utf8 (together with character_set_connection, utf8) indicates that "my" connection character set is utf8, i.e. the MySQL server expects utf8 encoded characters from the client (PHP). What's "your" connection charset?
Then take a look at the actual encoding of your parameter string, i.e. if you had
$foo = mysql_real_escape_string($_POST['foo'], $mysql);
replace that by
echo '<div>Debug hex($_POST[foo])=';
for ($i = 0; $i < strlen($_POST['foo']); $i++) {
    printf('%02X ', ord($_POST['foo'][$i]));
}
echo "</div>\n";
$foo = mysql_real_escape_string($_POST['foo'], $mysql);
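and check what the actual encoding of your input string is. Does it print 92 (a single latin1/Windows-1252 byte), C2 92 (what utf8_encode produced in the demonstration) or E2 80 99 (the UTF-8 encoding of ’, U+2019)?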
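If the debug output shows a lone 92 while the connection charset is utf8, one possible workaround is to convert the input before it is sent. The sketch below assumes, and this is only an assumption, that any input which is not valid UTF-8 is Windows-1252. The helper name to_utf8 is made up, and it needs the iconv extension.

```php
<?php
// Hypothetical helper: returns $s unchanged if it is already well-formed
// UTF-8, otherwise treats it as Windows-1252 and converts it.
// This is a guess about the input encoding, not real detection.
function to_utf8(string $s): string {
    if (preg_match('//u', $s) === 1) {
        return $s; // already valid UTF-8, leave it alone
    }
    $converted = iconv('CP1252', 'UTF-8', $s);
    if ($converted === false) {
        throw new RuntimeException('Input is neither UTF-8 nor Windows-1252');
    }
    return $converted;
}

// The byte 0x92 becomes the three bytes E2 80 99.
echo strtoupper(bin2hex(to_utf8("Moe\x92s"))), "\n"; // 4D6F65E2809973
```

Serving the form page as UTF-8 and keeping the connection charset in agreement with it is usually the cleaner fix. A conversion like this mainly helps when you cannot control what encoding the client sends.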