I have code which imports many text documents, containing foreign/special characters, into an Excel workbook:
Sub loadfiles()
Dim fpath As String
Dim fname As String
Application.ScreenUpdating = False
fpath = "...\data\"
fname = Dir(fpath & "*.txt")
For i = 1 To 10
Application.StatusBar = True
Application.StatusBar = "Progress: " & i & " of 10000"
Sheet1.Select
Range("A" & i).Value = fname
With ActiveSheet.QueryTables.Add(Connection:="TEXT;" _
& fpath & fname, Destination:=Range("B" & i))
.Name = "a"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = False
.RefreshOnFileOpen = False
.RefreshStyle = xlInsertDeleteCells
.SaveData = True
.AdjustColumnWidth = False
.RefreshPeriod = 0
.TextFilePromptOnRefresh = False
.TextFilePlatform = 437
.TextFileStartRow = 1
.TextFileParseType = xlDelimited
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = False
.TextFileColumnDataTypes = _
Array(xlTextFormat, xlSkipColumn, xlGeneralFormat)
.Refresh BackgroundQuery:=False
fname = Dir
End With
Next i
Application.StatusBar = False
Application.ScreenUpdating = True
MsgBox "Done"
End Sub
Is there any way to import text without losing original characters?
Try adding
.QueryType = xlTextImport
and changing
.TextFilePlatform = xlMSDOS
.PreserveFormatting = True
Instead of VBA a quick approach could be as follows.
1.Save the imported file as a csv
2.Open Excel
3.Import the data using Data-->Import External Data --> Import Data
4.Select the file type of "csv" and browse to your file
5.In the import wizard change the File_Origin to Select "Japanese shift-jis encoding" (or choose correct language character identifier)
6.Change the Delimiter to comma
7.Select where to import to and Finish.
This way the special characters should show correctly.
I uderstand adding \uFEFF at the beginning of any CSV file (generated in Java), Excel is able to open them correctly. The UTF-8 Byte-order marker will clue Excel 2007+ in to the fact that you're using UTF-8 UTF-8 is a variable width encoding. It only requires 1 byte to encode ASCII characters, but other code points will use multiple bytes.
Microsoft Excel mangles Diacritics in .csv files?