In python pandas, I have a dataset that looks like this:
For data before 2007-04-26 17:00:00, the time zone is US/Eastern. For data after, the time zone is American/Chicago.
When I run this:
data.index = data[:'2007-04-26 16:59:59'].index.tz_localize('US/Eastern', ambiguous = 'NaT').tz_convert('Europe/London')
I get an error that says:
NonExistentTimeError: 2006-04-02 02:00:00
This is indeed because of daylight savings time. I have the same problem for 2007. I don't have the problem for subsequent years. Ideally, I'd like two commands - one that converts the first half of the dataset from Eastern to London, and another that converts the second half from Chicago to London.
Since that didn't work, I tried dropping these times (an hour I believe), eg. 02:00:00 to 03:00:00 where there was daylight savings time. However, when I run
data.drop(data.ix['2005-04-03 2:00:00':'2005-04-03 3:00:00'], inplace=True)
I get
ValueError: labels ['open' 'high' 'low' 'close' 'volume'] not contained in axis
Does anyone know how I can simply convert these times? Any help would be greatly appreciated.
Thanks, Alex
UPDATE to add more info:
Ok I've used the following code which has worked to drop the offending times:
UPDATE 2:
mask = ((data.index<datetime.strptime("2006-04-02 02:00:00","%Y-%m-%d %H:%S:%M")) | (data.index>datetime.strptime("2006-04-02 03:00:00","%Y-%m-%d %H:%S:%M"))) & ((data.index<datetime.strptime("2005-04-03 02:00:00","%Y-%m-%d %H:%S:%M")) | (data.index>datetime.strptime("2005-04-03 03:00:00","%Y-%m-%d %H:%S:%M"))) & ((data.index<datetime.strptime("2005-10-30 01:00:00","%Y-%m-%d %H:%S:%M")) | (data.index>datetime.strptime("2005-10-30 02:00:00","%Y-%m-%d %H:%S:%M"))) & ((data.index<datetime.strptime("2006-10-29 01:00:00","%Y-%m-%d %H:%S:%M")) | (data.index>datetime.strptime("2006-10-29 02:00:00","%Y-%m-%d %H:%S:%M")))
data_filtered = data[mask]
data_filtered.ix = data_filtered.tz_localize('US/Eastern', infer_dst=True).tz_convert('Europe/London')
But now I get this error:
data_filtered.ix = data_filtered.tz_localize('US/Eastern', infer_dst=True).tz_convert('Europe/London')
Traceback (most recent call last):
File "<ipython-input-38-0fc8a9e68588>", line 1, in <module>
data_filtered.ix = data_filtered.tz_localize('US/Eastern', infer_dst=True).tz_convert('Europe/London')
File "C:\Anaconda\lib\site-packages\pandas\core\generic.py", line 1955, in __setattr__
object.__setattr__(self, name, value)
AttributeError: can't set attribute
Any ideas on this? I did some Googling but couldn't find anything really related..
Your drop command doesn't look like it should work based on the description in the docs. To get rid of the offending times, I would create a mask on the dataframe, ie:
Probably there's a way to make drop work too. Check this related question: Deleting rows of daylight saving time from a time indexed pandas dataframe