Is it possible to force Excel to recognize UTF-8 CSV files?

Posted 2018-12-31 05:01

I'm developing a part of an application that's responsible for exporting some data to CSV files. The application always uses UTF-8 because of its multilingual nature at all levels. But opening such CSV files (containing e.g. diacritics, Cyrillic letters, Greek letters) in Excel does not give the expected results: it shows something like Г„/Г¤, Г–/Г¶. And I don't know how to force Excel to understand that the opened CSV file is encoded in UTF-8. I also tried adding the UTF-8 BOM (EF BB BF), but Excel ignores it.

Is there any workaround?

P.S. Which tools may potentially behave like Excel does?
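
The garbled pairs quoted above are exactly what the UTF-8 bytes of Ä/ä and Ö/ö look like when they are decoded as Windows-1251, which suggests Excel fell back to the system ANSI code page. A minimal Python sketch reproducing this (the code page is inferred from the garbled output, it is not stated in the question, and `misread` is just an illustrative helper):

```python
# Reproduce the mojibake: encode as UTF-8, then decode with the wrong code page.
# Assumption: the reader's ANSI code page is Windows-1251 (inferred from the output).

def misread(text: str, wrong_encoding: str = "cp1251") -> str:
    """Return what `text` looks like when its UTF-8 bytes are decoded as `wrong_encoding`."""
    return text.encode("utf-8").decode(wrong_encoding)

for ch in "ÄäÖö":
    print(ch, "->", misread(ch))
```

On a machine with a different ANSI code page the garbage would look different, but the mechanism would be the same.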


UPDATE

I have to say that I confused the community with the wording of the question. When I asked it, I wanted a way of opening a UTF-8 CSV file in Excel without any problems for the user, in a smooth and transparent way. However, I used the wrong wording and asked for it to be done automatically. That is very confusing and clashes with VBA macro automation. There are two answers to this question that I appreciate the most: the very first answer by Alex https://stackoverflow.com/a/6002338/166589, which I accepted; and the second one by Mark https://stackoverflow.com/a/6488070/166589, which appeared a little later. From a usability point of view, Excel seemed to lack good user-friendly UTF-8 CSV support, so I consider both answers correct, and I accepted Alex's answer first because it clearly stated that Excel was not able to do that transparently. That is what I confused with "automatically" here. Mark's answer promotes a more complicated way for more advanced users to achieve the expected result. Both answers are great, but Alex's fits my not clearly specified question a little better.


UPDATE 2

Five months after the last edit, I noticed that Alex's answer has disappeared for some reason. I really hope it wasn't a technical issue, and I hope there is no more discussion about which answer is better. So I'm accepting Mark's answer as the best one.

Tags: excel csv utf-8
25 answers
与君花间醉酒
#2 · 2018-12-31 05:28

The bug with the ignored BOM seems to be fixed in Excel 2013. I had the same problem with Cyrillic letters, and adding the BOM character \uFEFF did help.
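
For illustration, here is a minimal Python sketch of writing a CSV that starts with the BOM. The function name `write_excel_csv` is made up for this example; the `utf-8-sig` codec is the standard Python codec that prepends EF BB BF on write.

```python
import csv

def write_excel_csv(path, rows):
    """Write rows as UTF-8 CSV with a leading BOM (EF BB BF)."""
    # "utf-8-sig" prepends the BOM on write; newline="" lets the csv module
    # control the line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)
```

Calling something like `write_excel_csv("export.csv", [["Имя", "Ä"]])` should then give a file that BOM-aware Excel versions open with the right characters.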

临风纵饮
#3 · 2018-12-31 05:28

This is an old question, but I've just encountered a similar problem and the solution may help others:

I had the same issue: after writing CSV text data out to a file, opening the resulting .csv in Excel put all the text into a single column. After reading the answers above I tried the following, which seems to sort the problem out.

Apply an encoding of UTF-8 when you create your StreamWriter. That's it.

Example:

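using System.IO;
using System.Text;

// Encoding.UTF8 makes the StreamWriter emit the UTF-8 BOM at the start of the file.
using (StreamWriter output = new StreamWriter(outputFileName, false, Encoding.UTF8, 2 << 22)) {
   /* ... do stuff .... */
   // No explicit Close() is needed: the using block disposes the writer.
}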
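
This presumably works because a StreamWriter created with Encoding.UTF8 writes the UTF-8 BOM at the start of the file. To check whether a generated file really starts with a BOM, something like the following Python sketch could be used. `detect_bom` is a hypothetical helper, not part of the answer, and it only looks for UTF-8 and UTF-16 marks.

```python
import codecs

# Known BOM prefixes; only UTF-8 and UTF-16 are checked in this sketch.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def detect_bom(path):
    """Return the encoding implied by the file's BOM, or None if there is none."""
    with open(path, "rb") as f:
        head = f.read(3)  # the longest BOM checked here is 3 bytes
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None
```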
爱死公子算了
#4 · 2018-12-31 05:28

Hi, I'm using Ruby on Rails for CSV generation. In our application we plan to support multiple languages (I18n), and we ran into an issue when viewing I18n content in the CSV file in Excel on Windows.

It was fine on Linux (Ubuntu) and Mac.

We found that in Excel on Windows the data has to be imported again to display correctly. The import dialog offers more options, including the character set.

But we can't teach this to each and every user, so the solution we were looking for was one where the file opens correctly with just a double click.

Then we found a way of getting the data to display in Excel on Windows by setting the open mode and a BOM, with the help of aghuddleston's gist (given as the reference).

Example I18n content

In Mac and Linux

Swedish : Förnamn English : First name

In Windows

Swedish : Förnamn English : First name

require 'csv'

def user_information_report(report_file_path, user_id)
  user = User.find(user_id)
  I18n.locale = user.current_lang
  # External encoding UTF-16LE, internal encoding UTF-8: strings are transcoded on write.
  open_mode = "w+:UTF-16LE:UTF-8"
  bom = "\xEF\xBB\xBF"
  body user, report_file_path, open_mode, bom
end

def headers
  [
    "ID", "SDN ID",
    I18n.t('sys_first_name'), I18n.t('sys_last_name'), I18n.t('sys_dob'),
    I18n.t('sys_gender'), I18n.t('sys_email'), I18n.t('sys_address'),
    I18n.t('sys_city'), I18n.t('sys_state'), I18n.t('sys_zip'),
    I18n.t('sys_phone_number')
  ]
end

# report_file_path is passed in explicitly; it was not in scope in this method before.
def body(tenant, report_file_path, open_mode, bom)
  File.open(report_file_path, open_mode) do |f|
    # Build the tab-separated content in memory first.
    csv_file = CSV.generate(col_sep: "\t") do |csv|
      csv << headers
      tenant.patients.find_each(batch_size: 10) do |patient|
        csv << [
          patient.id, patient.patientid,
          patient.first_name, patient.last_name, "#{patient.dob}",
          "#{translate_gender(patient.gender)}", patient.email, "#{patient.address_1.to_s} #{patient.address_2.to_s}",
          "#{patient.city}", "#{patient.state}", "#{patient.zip}",
          "#{patient.phone_number}"
        ]
      end
    end
    # The BOM has to be written before the CSV data.
    f.write bom
    f.write(csv_file)
  end
end

The important things to note here are the open mode and the BOM:

open_mode = "w+:UTF-16LE:UTF-8"

bom = "\xEF\xBB\xBF"

Insert the BOM before writing the CSV:

f.write bom

f.write(csv_file)
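
If I read the open mode correctly, everything written to the file is transcoded from UTF-8 to UTF-16LE, so the BOM lands on disk as FF FE and the result is a tab-separated UTF-16LE file. A rough Python equivalent of that file format, as a sketch only (`write_utf16_tsv` is a made-up name and this is not a port of the Rails code):

```python
import csv

def write_utf16_tsv(path, rows):
    """Write rows as tab-separated UTF-16LE text with a leading BOM."""
    # The "utf-16-le" codec does not add a BOM by itself, so it is written explicitly.
    with open(path, "w", encoding="utf-16-le", newline="") as f:
        f.write("\ufeff")  # encoded as the bytes FF FE
        csv.writer(f, delimiter="\t").writerows(rows)
```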

Windows and Mac

The file can be opened directly by double-clicking.

Linux (Ubuntu)

When opening the file, you are asked for the separator options: choose "TAB".

看风景的人
#5 · 2018-12-31 05:30

Simple VBA macro for opening UTF-8 text and CSV files

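Sub OpenTextFile()

   Dim filetoopen As Variant
   filetoopen = Application.GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
   ' GetOpenFilename returns False when the dialog is cancelled
   If filetoopen = False Then Exit Sub

   ' Origin 65001 is the UTF-8 code page; Comma:=True splits columns on commas
   Workbooks.OpenText Filename:=filetoopen, _
   Origin:=65001, DataType:=xlDelimited, Comma:=True

End Sub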

Origin:=65001 selects UTF-8. Comma:=True splits .csv files into columns at the commas.

Save it in Personal.xlsb to have it always available. Customize the Excel toolbar by adding a button that calls the macro, and open files from there. You can add more formatting to the macro, such as column autofit, alignment, etc.

素衣白纱
#6 · 2018-12-31 05:32

First save the Excel spreadsheet as Unicode text. Then open the TXT file in Internet Explorer, click "Save as", and under TXT Encoding choose the appropriate encoding, e.g. Windows-1251 for Cyrillic.

弹指情弦暗扣
#7 · 2018-12-31 05:33

Old question, but heck, the simplest solution is:

  1. Open CSV in Notepad
  2. Save As -> select the right encoding
  3. Open the new file
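
For many files, the same re-save step can be scripted. Below is a small Python sketch under the assumption that the source encoding is known; `reencode` is a hypothetical helper, and the default target `utf-8-sig` (UTF-8 with BOM) is my choice for the example, not something the steps above prescribe.

```python
def reencode(src, dst, src_encoding, dst_encoding="utf-8-sig"):
    """Copy a text file, decoding with src_encoding and encoding with dst_encoding."""
    # newline="" on both sides leaves the original line endings untouched.
    with open(src, "r", encoding=src_encoding, newline="") as fin, \
         open(dst, "w", encoding=dst_encoding, newline="") as fout:
        for line in fin:
            fout.write(line)
```

For example, `reencode("in.csv", "out.csv", "utf-8")` adds a BOM to a plain UTF-8 file, and `reencode("in.csv", "out.csv", "utf-8", "cp1251")` converts it to Windows-1251. If the input may already start with a BOM, reading it with `utf-8-sig` avoids carrying the mark over as data.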