R - UTC to LOCAL time given Olson timezones

2019-02-19 18:46发布

问题:

I have time series data from 1974-2013 with a column for datetimeUTC (YYYY-MM-DD hh:mm +0000), and a column for the timezones in Olson format (e.g., Canada/Pacific, Canada/Eastern). I can convert the whole UTCdatetime column to a common timezone like this:

dataset$datetimeEST <- strptime(
  dataset$datetimeUTC, format="%Y-%m-%d %H:%M:%S%z", tz="Canada/Eastern"
)  

How do I convert datetimeUTC to datetimeLOCAL, given the corresponding timezone in each row?

Let me back up a bit. I have data from across the country (6 timezones) formatted in ISO8601 representation for 1974-2013. The timestamps are in local standard time throughout the year (i.e. DST is disregarded even if civilian time in the region observes DST). I need to do datetime calculations which are probably safest to do in UTC time, so that's easy. But, I also need to pull data for specific civil time periods, taking into account DST, and do calculations and plots (e.g., all the data for rush hour at locations across all 6 timezones) for that subsetted data.

The datetimeCLOCKTIME that I calculated below appears to be doing what I want for plotting, but gives the wrong answer when doing datetime calculations because it stored the datetime in the timezone of my local machine without having actually converted the time. The solution offered by @thelatemail is what I'm looking for, but I haven't been able to get it to work in Windows on the test dataset for 2012 (see below). Also, I was using strptime which converts to POXITlt, and his solution is in POXITct. I'm new to R, so any help would be infinitely appreciated.

Test dataset:

    dataset <- data.frame (timestampISO8601 = c("2012-04-25T22:00:00-08:00","2012-04-25T22:15:00-08:00","2012-04-25T22:30:00-08:00","2012-04-25T22:45:00-08:00","2012-04-25T23:00:00-08:00","2012-04-25T23:15:00-08:00","2012-04-25T23:30:00-08:00","2012-04-25T23:45:00-08:00","2012-04-26T00:00:00-08:00","2012-04-26T00:15:00-08:00","2012-04-26T00:30:00-08:00","2012-04-26T00:45:00-08:00","2012-04-26T01:00:00-08:00","2012-04-26T01:15:00-08:00","2012-04-26T01:30:00-08:00","2012-04-26T01:45:00-08:00","2012-04-26T02:00:00-08:00","2012-04-25T22:00:00-03:30","2012-04-25T22:15:00-03:30","2012-04-25T22:30:00-03:30","2012-04-25T22:45:00-03:30","2012-04-25T23:00:00-03:30","2012-04-25T23:15:00-03:30","2012-04-25T23:30:00-03:30","2012-04-25T23:45:00-03:30","2012-04-26T00:00:00-03:30","2012-04-26T00:15:00-03:30","2012-04-26T00:30:00-03:30","2012-04-26T00:45:00-03:30","2012-04-26T01:00:00-03:30","2012-04-26T01:15:00-03:30","2012-04-26T01:30:00-03:30","2012-04-26T01:45:00-03:30","2012-04-26T02:00:00-03:30"), olson = c("Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Pacific","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland","Canada/Newfoundland"), value = c(0,0,1,2,5,11,17,19,20,19,17,11,5,2,1,0,0,-3,-3,-2,-1,2,8,14,16,17,16,14,8,2,-1,-2,-3,-3), stringsAsFactors=FALSE)

Remove the ":" from the UTC offset. (R is expecting the format nnnn for the UTC offset):

    dataset$timestampR<- paste(substr(dataset$timestampISO8601,1,22),substr(dataset$timestampISO8601,24,25),sep="")

When converting to UTC time, R defaults to -ve for the UTC offset, making -ve offsets in the timestamps positive:

    dataset$datetimeUTC <- strptime(dataset$timestampR, format="%Y-%m-%dT%H:%M:%S%z", tz="UTC")

When converting to MACHINE time like this, R reads the input time and converts it to the time in the timezone of the local machine - in my case, this is Canada/Eastern:

    dataset$datetimeMACHINE <- strptime(dataset$timestampR, format="%Y-%m-%dT%H:%M:%S%z")

When converting to CLOCKTIME time like this, R reads the input time and assigns the time zone of the local machine (currently EDT on my machine) without doing any time conversions:

    dataset$datetimeCLOCKTIME <- strptime(dataset$timestampR,format="%Y-%m-%dT%H:%M:%S")

See the structure of the dataset:

    str(dataset)  

Plotting behaviours are different

    library(ggplot2)
    qplot(data=dataset,x=datetimeUTC,y=value)
    qplot(data=dataset,x=datetimeMACHINE,y=value)
    qplot(data=dataset,x=datetimeCLOCKTIME,y=value)

Calculation results differ. Incorrect calculation result for datetimeCLOCKTIME:

    range (dataset$datetimeUTC)
    range (dataset$datetimeMACHINE)
    range (dataset$datetimeCLOCKTIME)

    dataset$datetimeUTC[34] - dataset$datetimeUTC[1]
    dataset$datetimeMACHINE[34] - dataset$datetimeMACHINE[1]
    dataset$datetimeCLOCKTIME[34] - dataset$datetimeCLOCKTIME[1]

回答1:

You could format back and forth a bit to get a local time representation in a character format. E.g.:

dataset <- data.frame(
  datetimeUTC=c("2014-01-01 00:00 +0000","2014-01-01 00:00 +0000"),
  olson=c("Canada/Eastern", "Canada/Pacific"),
  stringsAsFactors=FALSE
)

#             datetimeUTC          olson
#1 2014-01-01 00:00 +0000 Canada/Eastern
#2 2014-01-01 00:00 +0000 Canada/Pacific

dataset$localtime <- with(dataset, 
     mapply(function(dt,ol) format(
              as.POSIXct(dt,"%Y-%m-%d %H:%M %z",tz=ol),
              "%Y-%m-%d %H:%M %z"), 
              datetimeUTC, olson
            )
     )

#             datetimeUTC          olson              localtime
#1 2014-01-01 00:00 +0000 Canada/Eastern 2013-12-31 19:00 -0500
#2 2014-01-01 00:00 +0000 Canada/Pacific 2013-12-31 16:00 -0800


回答2:

If you have only two time zones to convert to and know the difference in time between UTC and those two. Using @thelatemail's dataset

transform(dataset, 
localtime=as.POSIXct(datetimeUTC, "%Y-%m-%d %H:%M %z")-
           c(5*3600,8*3600)[as.numeric(factor(olson))])
 #            datetimeUTC          olson           localtime
#1 2014-01-01 00:00 +0000 Canada/Eastern 2013-12-31 19:00:00
#2 2014-01-01 00:00 +0000 Canada/Pacific 2013-12-31 16:00:00