I have an Excel file that has some Spanish characters (tildes, etc.) that I need to convert to a CSV file to use as an import file. However, when I do Save As CSV it mangles the "special" Spanish characters that aren't ASCII characters. It also seems to do this with the left and right quotes and long dashes that appear to be coming from the original user creating the Excel file in Mac.
Since CSV is just a text file I'm sure it can handle a UTF8 encoding, so I'm guessing it is an Excel limitation, but I'm looking for a way to get from Excel to CSV and keep the non-ASCII characters intact.
The only "easy way" of doing this is as follows. First, realize that there is a difference between what is displayed and what is kept hidden in the Excel .csv file.
(1) Open an Excel file where you have the info (.xls, .xlsx)
(2) In Excel, choose "CSV (Comma Delimited) (*.csv) as the file type and save as that type.
(3) In NOTEPAD (found under "Programs" and then Accessories in Start menu), open the saved .csv file in Notepad
(4) Then choose -> Save As..and at the bottom of the "save as" box, there is a select box labelled as "Encoding". Select UTF-8 (do NOT use ANSI or you lose all accents etc). After selecting UTF-8, then save the file to a slightly different file name from the original.
This file is in UTF-8 and retains all characters and accents and can be imported, for example, into MySQL and other database programs.
This answer is taken from this forum.
Another one I've found useful: "Numbers" allows encoding-settings when saving as CSV.
Excel typically saves a csv file as ANSI encoding instead of utf8.
One option to correct the file is to use Notepad or Notepad++:
Encoding -> Convert to Ansi will encode it in ANSI/UNICODE. Utf8 is a subset of Unicode. Perhaps in ANSI will be encoded correctly, but here we are talking about UTF8, @SequenceDigitale.
There are faster ways, like exporting as csv ( comma delimited ) and then, opening that csv with Notepad++ ( free ), then Encoding > Convert to UTF8. But only if you have to do this once per file. If you need to change and export fequently, then the best is LibreOffice or GDocs solution.
You can use iconv command under Unix (also available on Windows as libiconv).
After saving as CSV under Excel in the command line put:
(remember to replace cp1250 with your encoding).
Works fast and great for big files like post codes database, which cannot be imported to GoogleDocs (400.000 cells limit).
Easy way to do it: download open office (here), load the spreadsheet and open the excel file (
.xls
or.xlsx
). Then just save it as a text CSV file and a window opens asking to keep the current format or to save as a .ODF format. select "keep the current format" and in the new window select the option that works better for you, according with the language that your file is been written on. For Spanish language select Western Europe (Windows-1252/ WinLatin 1
) and the file works just fine. If you select Unicode (UTF-8
), it is not going to work with the spanish characters.