I have time series data from 1974-2013 with a column for datetimeUTC (YYYY-MM-DD hh:mm +0000), and a column for the timezones in Olson format (e.g., Canada/Pacific, Canada/Eastern). I can convert the whole UTCdatetime column to a common timezone like this:
dataset$datetimeEST <- strptime(
dataset$datetimeUTC, format="%Y-%m-%d %H:%M:%S%z", tz="Canada/Eastern"
)
How do I convert datetimeUTC
to datetimeLOCAL
, given the corresponding timezone in each row?
Let me back up a bit. I have data from across the country (6 timezones) formatted in ISO8601 representation for 1974-2013. The timestamps are in local standard time throughout the year (i.e. DST is disregarded even if civilian time in the region observes DST). I need to do datetime calculations which are probably safest to do in UTC time, so that's easy. But, I also need to pull data for specific civil time periods, taking into account DST, and do calculations and plots (e.g., all the data for rush hour at locations across all 6 timezones) for that subsetted data.
The datetimeCLOCKTIME that I calculated below appears to be doing what I want for plotting, but gives the wrong answer when doing datetime calculations because it stored the datetime in the timezone of my local machine without having actually converted the time. The solution offered by @thelatemail is what I'm looking for, but I haven't been able to get it to work in Windows on the test dataset for 2012 (see below). Also, I was using strptime which converts to POXITlt, and his solution is in POXITct. I'm new to R, so any help would be infinitely appreciated.
Test dataset:
dataset <- data.frame (timestampISO8601 = c("2012-04-25T22:00:00-08:00","2012-04-25T22:15:00-08:00","2012-04-25T22:30:00-08:00","2012-04-25T22:45:00-08:00","2012-04-25T23:00:00-08:00","2012-04-25T23:15:00-08:00","2012-04-25T23:30:00-08:00","2012-04-25T23:45:00-08:00","2012-04-26T00:00:00-08:00","2012-04-26T00:15:00-08:00","2012-04-26T00:30:00-08:00","2012-04-26T00:45:00-08:00","2012-04-26T01:00:00-08:00","2012-04-26T01:15:00-08:00","2012-04-26T01:30:00-08:00","2012-04-26T01:45:00-08:00","2012-04-26T02:00:00-08:00","2012-04-25T22:00:00-03:30","2012-04-25T22:15:00-03:30","2012-04-25T22:30:00-03:30","2012-04-25T22:45:00-03:30","2012-04-25T23:00:00-03:30","2012-04-25T23:15:00-03:30","2012-04-25T23:30:00-03:30","2012-04-25T23:45:00-03:30","2012-04-26T00:00:00-03:30","2012-04-26T00:15:00-03:30","2012-04-26T00:30:00-03:30","2012-04-26T00:45:00-03:30","2012-04-26T01:00:00-03:30","2012-04-26T01:15:00-03:30","2012-04-26T01:30:00-03:30","2012-04-26T01:45:00-03:30","2012-04-26T02:00:00-03:30"), olson = c("Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland"), value = c(0,0,1,2,5,11,17,19,20,19,17,11,5,2,1,0,0,-3,-3,-2,-1,2,8,14,16,17,16,14,8,2,-1,-2,-3,-3), stringsAsFactors=FALSE)
Remove the ":" from the UTC offset. (R is expecting the format nnnn for the UTC offset):
dataset$timestampR<- paste(substr(dataset$timestampISO8601,1,22),substr(dataset$timestampISO8601,24,25),sep="")
When converting to UTC time, R defaults to -ve for the UTC offset, making -ve offsets in the timestamps positive:
dataset$datetimeUTC <- strptime(dataset$timestampR, format="%Y-%m-%dT%H:%M:%S%z", tz="UTC")
When converting to MACHINE time like this, R reads the input time and converts it to the time in the timezone of the local machine - in my case, this is Canada/Eastern:
dataset$datetimeMACHINE <- strptime(dataset$timestampR, format="%Y-%m-%dT%H:%M:%S%z")
When converting to CLOCKTIME time like this, R reads the input time and assigns the time zone of the local machine (currently EDT on my machine) without doing any time conversions:
dataset$datetimeCLOCKTIME <- strptime(dataset$timestampR,format="%Y-%m-%dT%H:%M:%S")
See the structure of the dataset:
str(dataset)
Plotting behaviours are different
library(ggplot2)
qplot(data=dataset,x=datetimeUTC,y=value)
qplot(data=dataset,x=datetimeMACHINE,y=value)
qplot(data=dataset,x=datetimeCLOCKTIME,y=value)
Calculation results differ. Incorrect calculation result for datetimeCLOCKTIME:
range (dataset$datetimeUTC)
range (dataset$datetimeMACHINE)
range (dataset$datetimeCLOCKTIME)
dataset$datetimeUTC[34] - dataset$datetimeUTC[1]
dataset$datetimeMACHINE[34] - dataset$datetimeMACHINE[1]
dataset$datetimeCLOCKTIME[34] - dataset$datetimeCLOCKTIME[1]