I have been provided with an xlsb file full of data. I want to process the data using Python. I can convert it to CSV using Excel or OpenOffice, but I would like the whole process to be more automated. Any ideas?
Update: I took a look at this question and used the first answer:
import subprocess
subprocess.call("cscript XlsToCsv.vbs data.xlsb data.csv", shell=False)
The issue is that the file contains Greek letters and the encoding is not preserved. When I open the CSV with Notepad++ it looks as it should, but when I try to insert it into a database it comes out like this: ���. When I open the CSV file just to read the text, it is displayed like this: \xc2\xc5\xcb instead of ΒΕΛ.
I realize it's an encoding issue, but is it possible to retain the original encoding when converting the xlsb file to CSV?
The script you reference seems to use the ActiveX interface to Excel and saves via its Workbook.SaveAs method. According to the MSDN documentation, this method has a TextCodepage argument, which may be helpful.
Sidenote: you can rewrite the VB script in Python; see this question.
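If the export itself cannot be changed, the encoding can also be handled on the Python side. The bytes quoted in the question, \xc2\xc5\xcb, decode to ΒΕΛ under the Greek Windows code page cp1253, so the CSV was probably written in that code page rather than UTF-8. That is an inference from the three sample bytes, not something confirmed for the whole file. A minimal sketch under that assumption (the helper names transcode_csv and read_rows are made up for illustration):

```python
import csv
import io


def transcode_csv(raw: bytes, source_encoding: str = "cp1253") -> bytes:
    """Decode CSV bytes from a legacy code page and re-encode them as UTF-8.

    cp1253 is an assumption based on the sample bytes in the question.
    """
    return raw.decode(source_encoding).encode("utf-8")


def read_rows(raw: bytes, source_encoding: str = "cp1253") -> list:
    """Parse CSV bytes into rows of str, decoding with the given code page."""
    text = raw.decode(source_encoding)
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    sample = b"\xc2\xc5\xcb,1\r\n"  # the bytes from the question plus a dummy column
    print(read_rows(sample))
```

For a file on disk, passing encoding="cp1253" and newline="" to open() before handing the file object to csv.reader has the same effect. Once the values are ordinary Python strings, they can be inserted into the database or written back out as UTF-8.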