I'm importing excel file, where the 'Date' column has different ways of writing:
Date
13/03/2017
13/03/2017
13/03/2017
13/03/2017
10/3/17
10/3/17
9/3/17
9/3/17
9/3/17
9/3/17
Importing to pandas:
df = pd.read_excel('data_excel.xls')
df.Date = pd.to_datetime(df.Date)
results in:
Date
13/03/2017
64 13/03/2017
65 13/03/2017
66 13/03/2017
67 2017-10-03 00:00:00
68 2017-10-03 00:00:00
69 2017-09-03 00:00:00
70 2017-09-03 00:00:00
71 2017-09-03 00:00:00
72 2017-09-03 00:00:00
Which means, pandas did not parse properly date and time:
10/3/17 -> 2017-10-03
when I tried to specify the format:
df.Date = pd.to_datetime(df.Date, format='%d%m%Y')
got the error:
ValueError: time data u'13/03/2017' does not match format '%d%m%Y' (match)
Question:
How to import properly date and times from the excel file to pandas?
New answer:
Actually
pd.to_datetime
has adayfirst
keyword argument that is useful here:Result:
Old answer:
Use the third-party module
dateutil
which can handle these kinds of variations. It has adayfirst
keyword argument that is useful here:Result: