I am parsing a huge ASCII file with dates assigned to entries, so I found myself using the datetime package alongside numpy.datetime64 to get array capabilities. I know that pandas is probably the most recommended package for dates, but I am trying to pull this off without pandas. I have been looking for a neat way to add or subtract a calendar step, such as one year or three months, from a datetime64 object.
Currently, I convert the dt64 object to a dt object, use its replace function to change the year, for example, and then convert it back to dt64, which feels a bit messy. So I would appreciate it if anyone has a better solution that uses only the numpy.datetime64 format.
Example: Converting a "YYYY-12-31" to "(YYYY-1)-12-31"
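import numpy as np
a = np.datetime64('2014-12-31') # a is a dt64 object (datetime64 takes an ISO string, not separate integers)
b = a.astype(object) # b is a datetime.date object converted from a
c = np.datetime64(b.replace(year=b.year - 1)) # c is a dt64 object shifted back 1 year (a - 1 year)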
You can use the numpy.timedelta64 object to perform time delta calculations on a numpy.datetime64 object, see Datetime and Timedelta Arithmetic.
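Since a year can be either 365 or 366 days, it is not possible to subtract a year directly from a day-resolution datetime64, but you could subtract 365 days instead (this only lands on the same calendar date when no leap day falls in the interval):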
import numpy as np
np.datetime64('2014-12-31') - np.timedelta64(365,'D')
results in:
numpy.datetime64('2013-12-31')
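If a calendar step is needed rather than a fixed number of days, one option is to do the arithmetic at month resolution and then put the day-of-month offset back. The following is only a minimal sketch under that assumption; `shift_months` is a hypothetical helper name, not a NumPy function. Note that a day that does not exist in the target month rolls over into the next month (for example, Jan 31 plus one month ends up in early March).

```python
import numpy as np

def shift_months(dates, months):
    """Shift datetime64 values by whole calendar months (hypothetical helper)."""
    dates = np.asarray(dates, dtype='datetime64[D]')
    month_start = dates.astype('datetime64[M]')   # truncate to the first of the month
    day_offset = dates - month_start              # timedelta64[D]: days since the 1st
    shifted = month_start + np.timedelta64(1, 'M') * np.asarray(months)
    # Go back to day resolution and re-apply the original day offset.
    return shifted.astype('datetime64[D]') + day_offset

a = np.datetime64('2014-12-31')
print(shift_months(a, -12))   # one year back
print(shift_months(a, 3))     # three months forward

# A fixed 365-day step drifts when the interval contains a leap day:
print(np.datetime64('2016-12-31') - np.timedelta64(365, 'D'))
```

The same function accepts arrays for both arguments, so a whole column of dates can be shifted in one call.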
How about:
import numpy as np
import pandas as pd
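def numpy_date_add(vd_array, y_array):
    # Truncate to year resolution and add the requested number of years.
    years = vd_array.astype('M8[Y]') + np.timedelta64(1, 'Y') * y_array
    # Re-apply the original month-of-year offset (a timedelta64 in months).
    months = years.astype('M8[M]') + (vd_array.astype('M8[M]') - vd_array.astype('M8[Y]'))
    # Re-apply the original day-of-month offset (a timedelta64 in days).
    ar = months.astype('M8[D]') + (vd_array.astype('M8[D]') - vd_array.astype('M8[M]'))
    return ar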
# usage
valDate=pd.Timestamp(2016,12,31)  # pd.datetime was removed in pandas 2.0
per=[[0,3,'0-3Yr'],
[3,7,'3-7Yrs'],
[7,10,'7-10Yrs'],
[10,15,'10-15Yrs'],
[15,20,'15-20Yrs'],
[20,30,'20-30Yrs'],
[30,40,'30-40Yrs'],
[40,200,'> 40Yrs']]
pert=pd.DataFrame(per,columns=['start_period','end_period','mat_band'])
pert['valDate']=valDate
pert['startdate'] = numpy_date_add(pert.valDate.values,pert.start_period.values)
pert['enddate'] = numpy_date_add(pert.valDate.values,pert.end_period.values)
print(pert)
This is a vectorized approach, shown here with pandas, and I think it deals with leap years.
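Since the question asks for a pandas-free solution, the same year/month/day decomposition can be tried on plain NumPy arrays. This is only a sketch of that idea; `add_years` is a hypothetical name, and the inputs are made-up sample dates. It also shows what "dealing with leap years" means here: Feb 29 shifted into a non-leap year rolls over to Mar 1 rather than raising an error.

```python
import numpy as np

def add_years(dates, years):
    """Add whole calendar years to datetime64 values (hypothetical helper)."""
    dates = np.asarray(dates, dtype='datetime64[D]')
    year_start = dates.astype('datetime64[Y]')
    month_start = dates.astype('datetime64[M]')
    month_offset = month_start - year_start   # timedelta64[M]: months since January
    day_offset = dates - month_start          # timedelta64[D]: days since the 1st
    new_year = year_start + np.timedelta64(1, 'Y') * np.asarray(years)
    new_month = new_year.astype('datetime64[M]') + month_offset
    return new_month.astype('datetime64[D]') + day_offset

dates = np.array(['2016-12-31', '2016-02-29'], dtype='datetime64[D]')
print(add_years(dates, np.array([3, 1])))
```

If clamping to the last valid day of the month is preferred over rolling over, that would need an extra step that is not shown here.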