Inserting Hebrew text into MySQL using PHP (garbag

2019-07-17 06:33发布

I'm facing a weird problem with inserting hebrew text into mysql.

Basically the problem is :
I have a php script which picks up hebrew text from a csv file then send it to mysql database. The charset of both database and all fields of tables are set to UTF8 and collation to utf8_bin. But when I insert it using mysql, random garbage value appears inside the text which renders it completely useless for output. NOTE : I can still see half of the words appear correctly.

Here is my homework which might help you in understanding :
1. As I mentioned the table charset and collation are utf8.
2. I've send header('Content-Type: text/html; charset=utf-8')
3. If I echo out the text, it appears perfectly. When I convert it using utf-8_encode it get converted properly. (eg. שי יפת get converted to ×©× ×פת)
4. When I use utf-8_decode on the converted variable and use echo, it still displays perfectly.
5. I've used these after mysql_connect

mysql_query("SET character_set_client = 'utf8';");
mysql_query("SET character_set_result = 'utf8';");
mysql_query("SET NAMES 'utf8'");
mysql_set_charset('utf8');

and even tried this :
mysql_query("SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'", $con)

  1. Added default_charset = "UTF-8" in my php.ini file.
  2. I am unaware of the encoding used in csv file but when I open it with notepad++ the encoding is utf-8 without BOM.
  3. Here is a sample of the actual garbage :
    original text : שי יפת
    text after utf8_encode : ×©× ×פת
    text after utf8_decode in same script : שי יפת (perfect)
    text send to mysql database : ש×? ×?פת (notice the ? in between)
    text if we echo from mysql : ש�? �?פת (the output is close)
  4. Used addslashes and stripslashes before utf8_encoding. (even tried after no luck)
  5. Server is on windows running xamp 1.7.4
    • Apache 2.2.17
    • MySQL 5.5.8 (Community Server)
    • PHP 5.3.5 (VC6 X86 32bit)

EDIT 1 : Just to clarify that I did searched the site for similar questions and did implemented the suggestions found (SET NAME UTF8 and alot other options etc) but it didn't work out. So please don't mark this question as repeat.

EDIT 2 : Here is the full script :

    <?php
header('Content-Type: text/html; charset=utf-8'); 

if (isset($_GET['filename'])==true)
{
$databasehost = "localhost";
$databasename = "what_csv";


$databaseusername="root";
$databasepassword="";
$databasename= "csv";

$fieldseparator = "\n";
$lineseparator = "@contact\n";


$csvfile = $_GET['filename'];
/********************************/


if(!file_exists($csvfile)) {
    echo "File not found. Make sure you specified the correct path.\n";
    exit;
}

$file = fopen($csvfile,"r");

if(!$file) {
    echo "Error opening data file.\n";
    exit;
}

$size = filesize($csvfile);

if(!$size) {
    echo "File is empty.\n";
    exit;
}

$csvcontent = fread($file,$size);

fclose($file);

$con = @mysql_connect($databasehost,$databaseusername,$databasepassword) or die(mysql_error());

mysql_query( "SET NAMES utf8" );
mysql_set_charset('utf8',$con);
/*
mysql_query("SET character_set_client = 'utf8';"); 
mysql_query("SET character_set_result = 'utf8';");

mysql_query("SET NAMES 'utf8'");
mysql_set_charset('utf8');

mysql_query("SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'", $con);
*/

@mysql_select_db($databasename) or die(mysql_error());



$lines = 0;
$queries = "";
$linearray = array();

foreach(explode($lineseparator,$csvcontent) as $line) {

$Name="";
$Landline1="";
$Landline2="";
$Mobile="";
$Address="";
$Email="";
$IMEI="temp";
$got_imei=false;

//echo $line.'<br>';
    $lines++;

    $line = trim($line," \t");

    $line = str_replace("\r","",$line);

    $linearray = explode($fieldseparator,$line);
    //check for values to insert
    foreach($linearray as $field)
    {
    if (is_numeric($field)){ $got_imei=true;$IMEI=trim($field);}
    if (stristr($field, 'Name:')) {$Name=trim(str_replace("Name:", "", $field));}   
    if (stristr($field, 'Landline:')) {$Landline1=trim(str_replace("Landline:", "", $field));}  
    if (stristr($field, 'Landline2:')) {$Landline2=trim(str_replace("Landline2:", "", $field));}    
    if (stristr($field, 'Mobile:')) {$Mobile=trim(str_replace("Mobile:", "", $field));} 
    if (stristr($field, 'Address:')) {$Address=trim(str_replace("Address:", "", $field));}
    if (stristr($field, 'Email:')) {$Email=trim(str_replace("Email:", "", $field));}



    }
    if ($got_imei==true)
    {

    $query = "UPDATE $databasetable SET imei=$IMEI where imei='temp'";
        mysql_query($query);

    }



    else if (($Name=="") &&  ($Landline1=="" ) && ($Landline2=="")  && ($Mobile=="")  && ($Address=="")) {echo "";}
    else
    {
        //$Name = utf8_encode("$Name");
        //$Name = addslashes("$Name");
        $Name = utf8_encode(mysql_real_escape_string("$Name"));

        echo"$Name,$Landline1,$Landline2,$Address,$IMEI<br>";
        $query = "insert into $databasetable (imei, name, landline1, landline2, mobile, address, email) values('$IMEI','$Name', '$Landline1','$Landline2','$Mobile', '$Address', '$Email');";
        mysql_query($query);
        $Name = utf8_decode(($Name));   
        echo $Name."<br>";

    }
}
@mysql_close($con);



echo "Found a total of $lines records in this csv file.\n";

}
?>


<form>
Enter file name <input type="text" name="filename" /><br />
<input type="submit" value="Submit" /><br>
NOTE : File must be present in same directory as this script. Please include full filename, for example filename.csv.
</form>

Here is a sample of csv file :

@contact
Name: שי יפת
Mobile: 0547939898

@IMEI
355310042074173

EDIT 3 :

If I directly enter the string via cmd I get this warning:

Warning Code : 1366
Incorrect string value: '\xD7\xA9\xD7\x99 \xD7...' for column 'name' at row 1

Here is something I found on the net that could be related, any help? http://bugs.mysql.com/bug.php?id=30131

2条回答
唯我独甜
2楼-- · 2019-07-17 07:12

I had this problem too. Thees lines solve it:

mysql_query( "SET NAMES utf8" );
mysql_query( "SET CHARACTER SET utf8" );

Shana Tova

查看更多
家丑人穷心不美
3楼-- · 2019-07-17 07:25

Use Text/LongText instead of varchar. Also use Collation as utf8_general_ci

Hope this will help you @Ajit

查看更多
登录 后发表回答