I'm reading excel files and writing them out as csv. A couple of columns contain dates which are formatted as float number in excel. All those fields need to get converted to a proper datetime (dd/mm/YY) before I wrote to CSV.
I found some good articles on how that works in general, but struggling to get that working for all rows in a opened sheet at once. (Newbie in Python)
Code looks like below for now:
wb = xlrd.open_workbook(args.inname)
xl_sheet = wb.sheet_by_index(0)
print args.inname
print ('Retrieved worksheet: %s' % xl_sheet.name)
print outname
# TODO: Convert xldate.datetime from the date fileds to propper datetime
output = open(outname, 'wb')
wr = csv.writer(output, quoting=csv.QUOTE_ALL)
for rownum in xrange(wb.sheet_by_index(0).nrows):
wr.writerow(wb.sheet_by_index(0).row_values(rownum))
output.close()
I'm sure i have to change the "for rownum ...." line but I'm struggling doing it. I tried several options, which all failed.
thanks
You need to go through the row before you write it out to file, converting values. You are right to identify that it is near the for rownum
line:
# You need to know which columns are dates before hand
# you can't get this from the "type" of the cell as they
# are just like any other number
date_cols = [5,16,23]
... # Your existing setup code here #
# write the header row (in response to OP comment)
headerrow = wb.sheet_by_index(0).row_values(0)
wr.writerow(headerrow)
# convert and write the data rows (note range now starts from 1, not 0)
for rownum in xrange(1,wb.sheet_by_index(0).nrows):
# Get the cell values and then convert the relevant ones before writing
cell_values = wb.sheet_by_index(0).row_values(rownum)
for col in date_cols:
cell_values[col] = excel_time_to_string(cell_values[col])
wr.writerow(cell_values)
Exactly what you put in your excel_time_to_string()
function is up to you - the answer by @MarkRansom has a reasonable approach - or you could use the xlrd
own package versions outlined in this answer.
For instance:
def excel_time_to_string(xltimeinput):
return str(xlrd.xldate.xldate_as_datetime(xltimeinput, wb.datemode))
* EDIT *
In response to request for help in comments after trying. Here's a more error-proof version of excel_time_to_string()
def excel_time_to_string(xltimeinput):
try:
retVal = xlrd.xldate.xldate_as_datetime(xltimeinput, wb.datemode)
except ValueError:
print('You passed in an argument in that can not be translated to a datetime.')
print('Will return original value and carry on')
retVal = xltimeinput
return retVal
The conversion from Excel to Python is quite simple:
>>> excel_time = 42054.441953
>>> datetime.datetime(1899,12,30) + datetime.timedelta(days=excel_time)
datetime.datetime(2015, 2, 19, 10, 36, 24, 739200)
Or to do the complete conversion to a string:
def excel_time_to_string(excel_time, fmt='%Y-%m-%d %H:%M:%S'):
dt = datetime.datetime(1899,12,30) + datetime.timedelta(days=excel_time)
return dt.strftime(fmt)
>>> excel_time_to_string(42054.441953)
'2015-02-19 10:36:24'
>>> excel_time_to_string(42054.441953, '%d/%m/%y')
'19/02/15'