I am attempting to bin dates using pd.cut(), but the setup is fairly elaborate.
A colleague sends me multiple files with report dates such as:
'03-16-2017 to 03-22-2017'
'03-23-2017 to 03-29-2017'
'03-30-2017 to 04-05-2017'
They are all combined into a single dataframe and given a column, df['Filedate'], so that every record in the file has the correct file date.
The last day is a cutoff point, so I created a new column df['Filedate_bin'] which converts the last day to 3/22/2017, 3/29/2017, 4/05/2017 as a string.
Then I created a list: Filedate_bin_list= df.Filedate_bin.unique(). As a result I have a unique list of string cutoff dates that I would like to use as bins.
After importing different data into a dataframe, there is a column of transaction dates: 3/28/2017, 3/29/2017, 3/30/2017, 4/1/2017, 4/2/2017, etc. Assigning them to a bin is difficult. I tried:
df['bin'] = pd.cut(df.Processed_date, Filedate_bin_list)
Received TypeError: unsupported operand type for -: 'str' and 'str'
I went back and tried converting Filedate_bin to datetime with format='%m/%d/%Y' and got:
TypeError: Cannot cast ufunc less input from dtype('<m8[ns]') to dtype ('<m8') with casting rule 'same_kind'.
Is there a better way to bin my processed dates into text bins?
I am trying to tie a processed date such as 3/27/2017 to '03-23-2017 to 03-29-2017'.
Consider this approach:
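A minimal sketch of this kind of approach, under several assumptions: the sample data, the column names Date and bin, and the variable names are all hypothetical; the lowest bin edge is taken as one day before the first report's start date; and dates are normalized to midnight so a time-of-day component cannot push a row past its cutoff. The epoch seconds are computed here by subtracting 1970-01-01 and dividing by a one-second Timedelta, which should give the same numbers as df.Date.astype(np.int64)//10**9 on nanosecond data without depending on the stored resolution.

```python
import pandas as pd

# Hypothetical sample data modelled on the question.
file_ranges = [
    "03-16-2017 to 03-22-2017",
    "03-23-2017 to 03-29-2017",
    "03-30-2017 to 04-05-2017",
]
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["3/22/2017", "3/27/2017", "3/29/2017", "3/30/2017", "4/2/2017"],
        format="%m/%d/%Y",
    )
})

# Parse the start and end date out of each report-range string.
starts = pd.to_datetime([r.split(" to ")[0] for r in file_ranges], format="%m-%d-%Y")
ends = pd.to_datetime([r.split(" to ")[1] for r in file_ranges], format="%m-%d-%Y")

# Bin edges: one day before the first start (assumption), then every cutoff date.
# With right=True each bin is (previous cutoff, cutoff], so a cutoff day
# belongs to the report that ends on it.
edges = pd.DatetimeIndex([starts[0] - pd.Timedelta(days=1), *ends])

# Convert both the dates and the edges to whole seconds since the UNIX epoch,
# so pd.cut only ever sees plain integers.
epoch = pd.Timestamp("1970-01-01")
one_second = pd.Timedelta(seconds=1)
date_seconds = (df["Date"].dt.normalize() - epoch) // one_second
edge_seconds = ((edges - epoch) // one_second).to_numpy()

# Use the original range strings as the bin labels.
df["bin"] = pd.cut(date_seconds, bins=edge_seconds, labels=file_ranges, right=True)

print(df)
```

Dates that fall outside all of the edges come back as NaN, which can be a convenient way to spot transactions that belong to no report file.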
Result:
Explanation:
df.Date.astype(np.int64)//10**9 - converts datetime values into UNIX epoch (timestamp - number of seconds since 1970-01-01 00:00:00); the same will be applied to bins
labels: