Non-consecutive intraday index

Posted 2019-09-04 11:48

Question:

This question is related to: Python pandas, how to only plot a DataFrame that actually have the datapoint and leave the gap out

I'd like to know the easiest way to produce a non-consecutive DatetimeIndex at intraday resolution that only keeps samples between certain [stock exchange] times, e.g. 08:00-16:30, and only on given weekdays, e.g. Mon-Fri. A bonus would be the ability to supply a calendar of valid dates.

At daily resolution, this is easy to do with pandas.bdate_range() for Mon-Fri. What I'd like is something analogous at intraday resolution, e.g. one second, that doesn't include Saturday/Sunday.
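For reference, the daily case mentioned above can be sketched as follows; the date range is an arbitrary example and not taken from real data:

```python
import pandas as pd

# Business days (Mon-Fri) only; weekends are skipped automatically.
days = pd.bdate_range("2016-01-01", "2016-01-16")

# dayofweek uses Monday=0 ... Sunday=6, so the maximum here is 4 (Friday).
last_weekday = days.dayofweek.max()
```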

The point of this is to be able to graph consecutive days of financial time series without 'gaps', while maintaining the labels. i.e. this:

vs the below (note that x labels are persisted, at the second resolution, although only dates are shown here - when you zoom in the time becomes visible):

This is not the only way to achieve this; see the linked questions for alternative suggestions (the easiest probably being to pass use_index=False to pandas.Series.plot()). But this question is specifically about creating a non-consecutive DatetimeIndex; I'm not asking for alternative solutions.

Answer 1:

You could create a full intraday index and filter out nights and weekends:

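import pandas as pd
# Every minute from 2016-01-01 00:00 to 2016-01-16 00:00
index = pd.date_range('2016-01-01', '2016-01-16', freq='1min')
# Keep Mon-Fri (dayofweek 0-4) and hours 8 through 16, i.e. 08:00-16:59
index[(index.dayofweek <= 4) & (index.hour >= 8) & (index.hour <= 16)]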

Output:

DatetimeIndex(['2016-01-01 08:00:00', '2016-01-01 08:01:00',
               '2016-01-01 08:02:00', '2016-01-01 08:03:00',
               '2016-01-01 08:04:00', '2016-01-01 08:05:00',
               '2016-01-01 08:06:00', '2016-01-01 08:07:00',
               '2016-01-01 08:08:00', '2016-01-01 08:09:00',
               ...
               '2016-01-15 16:50:00', '2016-01-15 16:51:00',
               '2016-01-15 16:52:00', '2016-01-15 16:53:00',
               '2016-01-15 16:54:00', '2016-01-15 16:55:00',
               '2016-01-15 16:56:00', '2016-01-15 16:57:00',
               '2016-01-15 16:58:00', '2016-01-15 16:59:00'],
              dtype='datetime64[ns]', length=5940, freq=None)
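
The hour-based mask above keeps every minute through 16:59, whereas the question asks for a 16:30 close. One possible way to cut at the exact session boundary is DatetimeIndex.indexer_between_time(), which returns the positions whose time of day falls in a given window. The following is a minimal sketch; session_index and its parameter names are hypothetical, chosen only for illustration:

```python
from datetime import time

import pandas as pd


def session_index(start, end, freq="1min",
                  open_time=time(8, 0), close_time=time(16, 30)):
    """Build a Mon-Fri index limited to [open_time, close_time].

    Hypothetical helper for illustration; not part of pandas.
    """
    full = pd.date_range(start, end, freq=freq)
    # Integer positions whose time of day lies in the session window
    # (both endpoints are included by default).
    in_session = full.indexer_between_time(open_time, close_time)
    session = full[in_session]
    # dayofweek: Monday=0 ... Sunday=6, so <= 4 keeps Mon-Fri.
    return session[session.dayofweek <= 4]


index = session_index("2016-01-01", "2016-01-16")
```

Because the window is expressed as times of day rather than whole hours, the same helper should also work at second resolution (freq="1s") without letting samples after 16:30:00 through.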

You could include a calendar by adding a condition to the mask:

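import numpy as np
# True for each timestamp whose calendar date appears in `calendar`
np.isin(index.date, calendar)

where calendar would be a numpy array of datetime.date objects (index.date yields datetime.date values, so the calendar entries must be of the same type to compare equal).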
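
Putting the pieces together, the calendar condition can be combined with the weekday and hour masks in a single expression. This sketch uses a pandas-only variant (normalize() plus isin()) rather than NumPy; the calendar dates are made-up example values:

```python
import pandas as pd

index = pd.date_range("2016-01-01", "2016-01-16", freq="1min")

# Hypothetical calendar of valid trading dates (illustrative values only).
calendar = pd.to_datetime(["2016-01-04", "2016-01-05", "2016-01-07"])

mask = (
    (index.dayofweek <= 4)
    & (index.hour >= 8)
    & (index.hour <= 16)
    # normalize() resets each timestamp to midnight so it can be matched
    # against the dates in the calendar.
    & index.normalize().isin(calendar)
)
filtered = index[mask]
```

Only timestamps that satisfy all conditions survive, so a date that is missing from the calendar is dropped even if it falls on a weekday.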