pd.to_datetime is getting half my dates with flipp

Posted 2020-02-01 17:08

My dataset has dates in the European (day-first) format, and I'm struggling to get them parsed correctly by pd.to_datetime: whenever the day is 12 or lower, the month and day get switched. Is there an easy solution to this?

import pandas as pd
import datetime as dt
df = pd.read_csv(loc,dayfirst=True)
df['Date']=pd.to_datetime(df['Date'])

Is there a way to force to_datetime to acknowledge that the input is formatted as dd/mm/yyyy?

Thanks for the help!

Edit, a sample from my dates:

renewal["Date"].head()
Out[235]: 
0    31/03/2018
2    30/04/2018
3    28/02/2018
4    30/04/2018
5    31/03/2018
Name: Earliest renewal date, dtype: object

After running the following:

renewal['Date']=pd.to_datetime(renewal['Date'],dayfirst=True)

I get:

Out[241]: 
0    2018-03-31  #Correct
2    2018-04-01  #<-- this number is wrong and should be 01-04 instead
3    2018-02-28   #Correct

2 Answers
家丑人穷心不美
#2 · 2020-02-01 17:29

Pass an explicit format so that pandas does not have to guess which field is the day and which is the month:

df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y') 
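
As a minimal sketch of what the explicit format does (the sample values below are made up for illustration, not taken from the asker's file): every string is read strictly as day/month/year, and a value that cannot fit that pattern is not silently reinterpreted. Passing errors="coerce" is an optional choice here that turns such values into NaT instead of raising.

```python
import pandas as pd

# Illustrative dd/mm/yyyy values; 01/04/2018 is ambiguous without a format
dates = pd.Series(["31/03/2018", "01/04/2018", "28/02/2018"])

# With an explicit format the first field is always read as the day
parsed = pd.to_datetime(dates, format="%d/%m/%Y")
print(parsed.dt.strftime("%Y-%m-%d").tolist())
# ['2018-03-31', '2018-04-01', '2018-02-28']

# A month-first value such as 04/13/2018 cannot match %d/%m/%Y (no month 13),
# so with errors="coerce" it becomes NaT rather than a swapped date
bad = pd.to_datetime(pd.Series(["04/13/2018"]), format="%d/%m/%Y", errors="coerce")
print(bad.isna().tolist())
# [True]
```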
何必那么认真
#3 · 2020-02-01 17:52

You can control the date construction directly if you define separate columns for 'year', 'month' and 'day', like this:

import pandas as pd
df = pd.DataFrame(
    {'Date': ['01/03/2018', '06/08/2018', '31/03/2018', '30/04/2018']}
)
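# Split each dd/mm/yyyy string into three integer columns
date_parts = df['Date'].str.split('/', expand=True).astype(int)
date_parts.columns = ['day', 'month', 'year']
# to_datetime assembles dates from columns named year, month and day
df['Date'] = pd.to_datetime(date_parts)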

date_parts
#    day  month  year
# 0    1      3  2018
# 1    6      8  2018
# 2   31      3  2018
# 3   30      4  2018

df
#         Date
# 0 2018-03-01
# 1 2018-08-06
# 2 2018-03-31
# 3 2018-04-30
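
For data that comes from a CSV file, as in the question, day-first parsing can also be requested at read time. Note that dayfirst is only used by read_csv when date parsing is actually requested through parse_dates. This is a minimal sketch, assuming the column is named Date; the in-memory csv_text and the Value column are hypothetical stand-ins for the real file.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real file
csv_text = "Date,Value\n31/03/2018,1\n01/04/2018,2\n28/02/2018,3\n"

# dayfirst has an effect here because the Date column is listed in parse_dates
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], dayfirst=True)

print(df["Date"].dt.day.tolist())    # [31, 1, 28]
print(df["Date"].dt.month.tolist())  # [3, 4, 2]
```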