A problem with passing Japanese characters (UTF-8)

Posted 2020-03-30 04:35

I'm having trouble transferring Japanese characters from PHP to JavaScript via json_encode.

Here is the raw data read from the CSV file.

PRODUCT1,QA,テスト
PRODUCT2,QA,aテスト
PRODUCT3,QA,1テスト

The problem is that when I pass this data with echo json_encode($return_value), where $return_value is a 2-dimensional array containing the data above, the Japanese word 'テスト' gets dropped and shows up as an empty string on the ajax response side. However, if I put any letter or digit at the start of the Japanese word, like 'aテスト' or '1テスト' in the 2nd and 3rd lines of the example above, those words are passed correctly.

Below is what the data looks like on the ajax response side. As you can see, the 3rd element of the 1st block is empty. If I remove 'a' or '1' from the other words in the raw data above, those words become empty on the response side too. This happens with every Japanese character I have tested so far.

[["PRODUCT1","QA",""],["PRODUCT2","QA","a\u30c6\u30b9\u30c8"],["PRODUCT3","QA","1\u30c6\u30b9\u30c8"]]
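
As a quick sanity check on the response itself: the \uXXXX sequences are ordinary JSON escapes and decode back to the Japanese text, so the empty cell is already empty before JavaScript touches it. A minimal sketch, assuming the response body is exactly the string shown above (the names responseText, rows, and emptyCells are only for illustration):

```javascript
// Response body copied verbatim from above; String.raw keeps the
// backslashes so JSON.parse sees the same escapes the server sent.
const responseText = String.raw`[["PRODUCT1","QA",""],["PRODUCT2","QA","a\u30c6\u30b9\u30c8"],["PRODUCT3","QA","1\u30c6\u30b9\u30c8"]]`;

// JSON.parse turns \u30c6\u30b9\u30c8 back into the characters テスト.
const rows = JSON.parse(responseText);

// Collect the [row, column] position of every empty cell.
const emptyCells = rows.flatMap((row, r) =>
  row
    .map((cell, c) => (cell === "" ? [r, c] : null))
    .filter((pos) => pos !== null)
);

console.log(rows[1][2]);   // aテスト
console.log(emptyCells);   // [ [ 0, 2 ] ]
```

Since the escaped values decode correctly and only the first row's third cell is empty, the data must have been lost on the PHP side, before json_encode produced the response.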

Does anybody have any idea why this is happening and how I can overcome this problem?

Here is a part of the code from each side.

 PHP:
 function getFileContents($dirName, $filename) {

     $return_value = array();
     $my_file = fopen($dirName . $filename, "r");

     $row = 0;
     while (($data = fgetcsv($my_file, 1000, ",")) !== FALSE) {
         $num = count($data);
         for ($c = 0; $c < $num; $c++) {
             // The CSV file is written in EUC-JP, so convert to UTF-8 here.
             $return_value[$row][$c] = mb_convert_encoding($data[$c], "UTF-8", "EUC-JP");
         }
         $row++;
     }
     fclose($my_file);

     echo json_encode($return_value);
 }

 JavaScript:
 $.ajax({
     type: "POST",
     url: "data.php",
     data: {
         "dirName": "./data/",
         "filename": filename
     },
     dataType: "json",
     success: function(response) {
         // more code
         // At this point, Japanese characters are already empty strings.
     }
 });

Thanks a lot for your help in advance!

1 answer
狗以群分
#2 · 2020-03-30 05:04

I found that the problem was that PHP's fgetcsv() function could not recognize the characters in EUC-JP. Apparently, fgetcsv() uses the system locale setting to make assumptions about character encoding. I added the line below before calling fgetcsv(), as the referenced example shows (but in the reverse direction), and it fixed the problem!

setlocale(LC_ALL, 'ja_JP.EUC-JP'); 