CSV: how to include double byte characters

Posted 2020-04-10 04:09

I have to generate a CSV file that includes double-byte characters (Chinese, Japanese). The CSV file opens and the text reads correctly in a text editor.

But the same file shows garbage text when opened in Excel. What did I miss?

Tags: excel csv
1 answer
\"骚年 ilove
#2 · 2020-04-10 04:29

Unfortunately, you did not miss anything. It is Microsoft Excel that cannot handle Unicode CSV files properly when you simply open them.

When Excel saves a CSV file, it does not use Unicode. By default it uses a legacy code page that depends on the Office language version. Not only is Unicode not the default, although it is the state of the art in the 21st century; at least in older Excel versions it is not even possible to choose Unicode when saving as CSV. The only text format that can be saved as Unicode is Unicode Text (*.txt), but that is a tab-delimited format rather than CSV.

Likewise, when Excel opens a CSV file, it does not assume Unicode. Instead it assumes the same default encoding it would use when saving CSV. That is why garbage characters appear if the CSV contains Unicode text.
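To illustrate the mismatch, here is a small Python sketch that encodes text as UTF-8 and then decodes the bytes with a legacy single-byte code page. It assumes Windows-1252 as the default code page (typical for Western European Office installs); the code page on your machine may differ, and the helper name `misread_as_legacy` is made up for this example.

```python
def misread_as_legacy(text, legacy_codepage="cp1252"):
    """Decode UTF-8 bytes as if they were in a legacy single-byte code page.

    cp1252 is an assumed default; Excel's actual choice depends on the
    Office language version and system locale.
    """
    raw = text.encode("utf-8")
    # errors="replace" keeps the sketch from failing on bytes that the
    # legacy code page leaves undefined.
    return raw.decode(legacy_codepage, errors="replace")


original = "日本語"
garbled = misread_as_legacy(original)
print(repr(garbled))  # nine unrelated Latin characters instead of three kanji
```

Each of the three characters takes three bytes in UTF-8, and a single-byte code page turns every byte into its own character, which is the kind of garbage text Excel displays.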

There is one exception. If the CSV is UTF-8 encoded, starts with a UTF-8 BOM, and uses the default delimiter, then Excel can open it properly.
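If the file is generated by a program, adding the BOM is usually a one-line change. A minimal sketch in Python, assuming the generator can be written in Python and that a comma is the default delimiter on the target machine (in some regional settings Excel expects a semicolon); the function name `write_excel_csv` is hypothetical:

```python
import csv


def write_excel_csv(path, rows):
    """Write rows as a UTF-8 CSV that starts with a BOM."""
    # The "utf-8-sig" codec emits the UTF-8 BOM (bytes EF BB BF) before the data.
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)
```

Calling `write_excel_csv("report.csv", [["name", "city"], ["山田", "東京"]])` produces a file whose first three bytes are the BOM, followed by ordinary UTF-8 CSV text.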

There is also the Text Import Wizard. If you use it, you can set the encoding in step 1 under File origin; choose 65001 : Unicode (UTF-8) for a UTF-8 file. The wizard should be able to import any CSV file properly.
