I have the following dataframe:
user_id purchase_date
1 2015-01-23 14:05:21
2 2015-02-05 05:07:30
3 2015-02-18 17:08:51
4 2015-03-21 17:07:30
5 2015-03-11 18:32:56
6 2015-03-03 11:02:30
and purchase_date is a datetime64[ns] column. I need to add a new column, df['month'], that contains the first day of the month of the purchase date:
df['month']
2015-01-01
2015-02-01
2015-02-01
2015-03-01
2015-03-01
2015-03-01
I'm looking for something like DATE_FORMAT(purchase_date, "%Y-%m-01") m
in SQL. I have tried the following code:
df['month']=df['purchase_date'].apply(lambda x : x.replace(day=1))
It sort of works, but it keeps the time component and returns: 2015-01-01 14:05:21.
The simplest and fastest approach is to convert to a NumPy array via .values and then cast:
Another solution uses floor and pd.offsets.MonthBegin(0):
The last solution is to create a month period with to_period:
... and then convert to datetimes with to_timestamp, but it is a bit slower:
There are many solutions, so:
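The four approaches above can be sketched as follows on the question's sample data. This is a minimal sketch rather than the answer's exact code: the variable names are illustrative, and the offset variant rolls forward with MonthEnd(0) before subtracting MonthBegin(1), which is my assumption for keeping dates that already fall on the first of the month in place.

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "purchase_date": pd.to_datetime([
        "2015-01-23 14:05:21",
        "2015-02-05 05:07:30",
        "2015-02-18 17:08:51",
        "2015-03-21 17:07:30",
        "2015-03-11 18:32:56",
        "2015-03-03 11:02:30",
    ]),
})

# 1) NumPy cast: truncate to month precision, then back to nanoseconds
month_np = pd.Series(
    df["purchase_date"].values.astype("datetime64[M]").astype("datetime64[ns]"),
    index=df.index,
)

# 2) floor to midnight, roll forward to month end, then back to month start
#    (the MonthEnd(0) step is an assumption of this sketch, see above)
month_off = (
    df["purchase_date"].dt.floor("D")
    + pd.offsets.MonthEnd(0)
    - pd.offsets.MonthBegin(1)
)

# 3) month period (Period dtype, not datetime)
month_period = df["purchase_date"].dt.to_period("M")

# 4) month period converted back to timestamps at the start of each period
month_ts = month_period.dt.to_timestamp()

df["month"] = month_np
print(df)
```

Variants 1, 2 and 4 should all give 2015-01-01, 2015-02-01, 2015-02-01, 2015-03-01, 2015-03-01, 2015-03-01, while variant 3 gives Period values such as 2015-01.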
Timings:
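If you want to compare the approaches on your own machine, a small harness along these lines could be used. It is only a sketch: the 20,000-row random series, the candidate names and the repeat counts are all hypothetical choices, and no particular numbers are implied.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: 20,000 random timestamps within 2015
rng = np.random.default_rng(0)
seconds = rng.integers(0, 365 * 24 * 3600, size=20_000)
s = pd.Series(pd.Timestamp("2015-01-01") + pd.to_timedelta(seconds, unit="s"))

candidates = {
    "numpy cast": lambda: s.values.astype("datetime64[M]"),
    "to_period": lambda: s.dt.to_period("M"),
    "to_period + to_timestamp": lambda: s.dt.to_period("M").dt.to_timestamp(),
    "apply + replace": lambda: s.apply(lambda x: x.replace(day=1)),
}

# Best of three single runs per candidate, in seconds
timings = {
    name: min(timeit.repeat(fn, number=1, repeat=3))
    for name, fn in candidates.items()
}

for name, t in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:28s} {t * 1000:.3f} ms")
```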
For me, df['purchase_date'] - pd.offsets.MonthBegin(1) didn't work (it fails for the first day of the month), so I'm subtracting the days of the month like this:
Try this:
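To illustrate the first-of-the-month problem and the day-subtraction workaround mentioned above, here is a minimal sketch. The two-row series and the variable names are made up for the example, and the exact expression is my guess at one way to subtract the days, not necessarily the original code.

```python
import pandas as pd

# Illustrative data: one timestamp on the 1st, one mid-month
s = pd.Series(pd.to_datetime(["2015-03-01 09:00:00", "2015-03-11 18:32:56"]))

# Plain offset subtraction: a date already on the 1st rolls back a whole month
naive = s.dt.normalize() - pd.offsets.MonthBegin(1)

# Subtract (day - 1) days instead, so the 1st stays where it is
first_of_month = s.dt.normalize() - pd.to_timedelta(s.dt.day - 1, unit="D")

print(naive.tolist())           # first row lands in February
print(first_of_month.tolist())  # both rows land on 2015-03-01
```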
We can use date offset in conjunction with Series.dt.normalize:
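A minimal sketch of that combination on the question's data might look like the following. The names are illustrative, and it assumes no timestamp falls on the first day of a month, since subtracting MonthBegin(1) would move such a date to the previous month.

```python
import pandas as pd

purchase_date = pd.Series(pd.to_datetime([
    "2015-01-23 14:05:21",
    "2015-02-05 05:07:30",
    "2015-02-18 17:08:51",
    "2015-03-21 17:07:30",
    "2015-03-11 18:32:56",
    "2015-03-03 11:02:30",
]))

# normalize() sets the time part to midnight, keeping the date
midnight = purchase_date.dt.normalize()

# Subtracting one MonthBegin rolls each mid-month date back to the 1st
month = midnight - pd.offsets.MonthBegin(1)

print(month)
```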
Or a much nicer solution from @BradSolomon: