JavaScript export CSV encoding UTF-8 issue

Posted 2020-01-29 07:37

Question:

I need to export a JavaScript array to a CSV file and download it. I did that, but the characters 'ı,ü,ö,ğ,ş' look like 'Ä± Ã¼ Ã¶ ÄŸ ÅŸ' in the CSV file. I have tried many solutions recommended on this site, but none of them worked for me.

I have added my code snippet. Can anyone solve this problem?

var csvString = 'ı,ü,ö,ğ,ş';

var a = window.document.createElement('a');
a.setAttribute('href', 'data:text/csv; charset=utf-8,' + encodeURIComponent(csvString));
a.setAttribute('download', 'example.csv');
a.click();

Answer 1:

This depends on which program opens the example.csv file. In a text editor, the file is read as UTF-8 and the characters are not malformed. Excel, however, uses ANSI rather than UTF-8 as its default encoding for CSV. So unless Excel is forced to use UTF-8 instead of ANSI, the characters will be malformed.

Excel can be forced to use UTF-8 for CSV by putting a BOM (Byte Order Mark) at the very start of the file. The BOM for UTF-8 is the byte sequence 0xEF,0xBB,0xBF. So one might think that simply prepending "\xEF\xBB\xBF" to the string would be the solution. But surely that would be too simple, wouldn't it? ;-) The problem is that JavaScript treats those escapes as three characters (U+00EF, U+00BB, U+00BF) rather than as raw bytes, so each of them is itself encoded to UTF-8 when the file is written. The "solution" is to use the "universal BOM" "\uFEFF", as mentioned in Special Characters (JavaScript).
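The difference between the two prefixes can be checked directly. This is a small sketch that percent-encodes both candidates with encodeURIComponent, the same function used to build the data URL; the variable names are only illustrative.

```javascript
// Percent-encode each candidate prefix the same way the data URL does.
const naiveBom = '\xEF\xBB\xBF'; // three characters: U+00EF, U+00BB, U+00BF
const universalBom = '\uFEFF';   // one character: U+FEFF

const naiveEncoded = encodeURIComponent(naiveBom);
const universalEncoded = encodeURIComponent(universalBom);

console.log(naiveEncoded);     // %C3%AF%C2%BB%C2%BF (six bytes, not a BOM)
console.log(universalEncoded); // %EF%BB%BF (the UTF-8 BOM)
```

Only the second form produces the three bytes that Excel looks for at the start of the file.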

Example:

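var csvString = 'ı,ü,ö,ğ,ş';
var universalBOM = "\uFEFF"; // becomes the bytes EF BB BF once encoded as UTF-8
var a = window.document.createElement('a');
// Data URL written without the stray space after the semicolon.
a.setAttribute('href', 'data:text/csv;charset=utf-8,' + encodeURIComponent(universalBOM + csvString));
a.setAttribute('download', 'example.csv');
// Some browsers only honor click() on links that are attached to the document.
window.document.body.appendChild(a);
a.click();
window.document.body.removeChild(a);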

See also Adding UTF-8 BOM to string/Blob.
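
As a rough sketch of the Blob variant, assuming a browser that supports Blob, URL.createObjectURL and TextEncoder: the helper names csvToUtf8Bytes and downloadCsv are hypothetical and not taken from the linked question.

```javascript
// Hypothetical helper: UTF-8 bytes of the CSV text, prefixed with the BOM.
function csvToUtf8Bytes(csvString) {
  // TextEncoder always produces UTF-8; U+FEFF becomes the bytes EF BB BF.
  return new TextEncoder().encode('\uFEFF' + csvString);
}

// Hypothetical helper (browser only): offer the CSV text as a file download.
function downloadCsv(csvString, filename) {
  const blob = new Blob([csvToUtf8Bytes(csvString)], { type: 'text/csv;charset=utf-8' });
  const url = URL.createObjectURL(blob);
  const a = document.createElement('a');
  a.href = url;
  a.download = filename;
  document.body.appendChild(a); // attach so that click() works everywhere
  a.click();
  document.body.removeChild(a);
  URL.revokeObjectURL(url); // release the object URL again
}
```

In a page this would be called as downloadCsv('ı,ü,ö,ğ,ş', 'example.csv'). Compared with a data URL, a Blob avoids percent-encoding the whole file into the href, which may matter for larger exports.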

With this, the encoding will be correct. Nevertheless, the file only opens properly if the comma is the default list separator in your Windows locale settings. If it is not, for example if the semicolon is the default list separator, then all content will end up in the first column instead of being split at the commas. In that case you have to use the semicolon as the delimiter in the CSV as well. But this is another problem, and it leads to the conclusion that it may be better not to use CSV at all, but to use a library that can create Excel files (*.xls or *.xlsx) directly.
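
If you stay with CSV, one way to cope with the separator issue is to make the delimiter a parameter. The following is a minimal sketch; toCsv is a hypothetical helper, and which delimiter a given Excel installation expects still has to be decided by the caller.

```javascript
// Hypothetical helper: serialize rows to CSV text with a configurable delimiter.
function toCsv(rows, delimiter = ',') {
  const escapeField = (value) => {
    const text = String(value);
    // Quote fields that contain the delimiter, a double quote, or a line break.
    const needsQuotes = text.includes(delimiter) || /["\r\n]/.test(text);
    return needsQuotes ? '"' + text.replace(/"/g, '""') + '"' : text;
  };
  return rows.map((row) => row.map(escapeField).join(delimiter)).join('\r\n');
}

const rows = [['ı', 'ü', 'ö'], ['ğ', 'ş', 'a;b']];
// Semicolon-separated output, still prefixed with the BOM for Excel.
const csvSemicolon = '\uFEFF' + toCsv(rows, ';');
```

The resulting string can be passed to encodeURIComponent exactly like csvString in the example above. The field 'a;b' is wrapped in double quotes so that its semicolon is not read as a column break.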