I have been stuck for days on exporting UTF-8 CSV with chinese characters that shows garbled text on Windows Excel. I am using PHP and have already added the BOM byte mark and tried encoding but no luck at all.
They open fine on Notepad++, Google Spreadsheet and even on Mac Numbers. But not on Excel which is a requirement by the client. When opening with Notepad++ the encoding is shown as UTF-8. If I change it to UTF-8 manually and save, the file opens fine on Excel.
It seems as though the BOM byte mark doesn't get saved in the output as Notepad++ always detect it as UTF-8 without BOM.
Also, the CSV is not saved on server. Data is retrieved from DB and then exported directly out.
Here are my codes:
// Setup headers
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Content-Description: File Transfer');
header("Content-type: text/csv");
header("Content-disposition: filename=".$filename.".csv");
header("Pragma: no-cache");
// First Method
$fp = fopen('php://output', 'w');
// Add BOM to fix UTF-8 in Excel, but doesn't work
fputs($fp, chr(0xEF) . chr(0xBB) . chr(0xBF) );
if ($fp) {
fputcsv($fp, array("Header"), ",");
fputcsv($fp, array($string_with_chinese_chars), ",");
}
fclose($fp);
exit();
// Second Method
$csv = "";
$sep = ",";
$newline = "\n"; // Also tried with PHP_EOL
$csv .= "Header";
$csv .= $newline;
$csv .= $string_with_chinese_chars;
$csv .= $newline;
// Tried all the below ways but doesn't work.
// Method 2.1
print chr(255) . chr(254) . mb_convert_encoding($csv, 'UTF-16LE', 'UTF-8');
// Method 2.2
print chr(239) . chr(187) . chr(191) . $csv;
// Method 2.3
print chr(0xEF).chr(0xBB).chr(0xBF);
print $newline;
print $csv;
Hope this can help someone. What worked for me was I had to put both:
I'm not an expert in PHP so I can't explain why this works by I hope this helps someone because I had a hard time also solving this problem.
Below code worked for me. Output utf-8-bom characters before csv content:
I usually do it this way:
And I use the
";"
as seperator as excel most likley doesnt autoformat the","
Based on your comment above, it looks like your script is accidentally printing out a newline (hex
0A
) before the UTF-8 BOM, causing Excel not to recognize the output as UTF-8.Since you're using PHP, make sure that there's no empty line before the
<?php
marker in your script, or in any other PHP file that it might include. Also make sure that none of the files you include has any whitespace after the closing?>
marker, if there is one.In practice, this can be quite hard to do, since many text editors insist on always appending a newline to the end of the last line. Thus, the safest and easiest solution is to simply leave out the
?>
marker from your PHP files, unless you intend to print out whatever comes after it. PHP does not require the?>
to be present, and using it in files that are not meant to be mixed PHP and literal template HTML (or other text) is just asking for bugs like this.