I want to split an xts/zoo time series in R on a weekly basis. The timezone is set to "Asia/Kolkata":
Sys.setenv(TZ="Asia/Kolkata")
library(xts)
seqs<- seq(as.POSIXct("2016-01-01"),as.POSIXct("2016-01-30"), by = "30 mins")
ob<- xts(data.frame(value=1:(length(seqs))),seqs)
weekdata <- split(ob,f="weeks",k=1)
The problem with this split is that each week's data is offset by 5 hours 30 minutes, as shown below:
> head(weekdata[[2]],2)
value
2016-01-04 05:30:00 156
2016-01-04 06:00:00 157
> head(weekdata[[3]],2)
value
2016-01-11 05:30:00 492
2016-01-11 06:00:00 493
I know this is due to the timezone (Asia/Kolkata is 5 hours 30 minutes ahead of UTC). I also believe it can be fixed using the endpoints function, but I find it difficult to get right. Can anyone provide some pointers?
If I understand correctly, your desired output is a list of xts objects where every element holds the data for one week.
You can do that with this:
Sys.setenv(TZ="Asia/Kolkata")
library(xts)
library(lubridate)
seqs <- seq(as.POSIXct("2016-01-01"), as.POSIXct("2016-01-30"), by = "30 mins")
weeks <- week(seqs)
df <- data.frame(seqs, weeks)
ob <- xts(data.frame(value = 1:(length(seqs))), seqs)
weekdata <- lapply(unique(weeks), function(i) {
  ob[weeks == i]
})
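One thing to be aware of with this approach: lubridate's week() counts seven-day periods starting from January 1st, so the groups do not necessarily begin on a Monday. If Monday-based weeks are what you want, isoweek() may be closer. A small sketch comparing the two (the vector name d is just for illustration):

```r
library(lubridate)

# 2016-01-01 is a Friday, 2016-01-04 a Monday, 2016-01-08 a Friday
d <- as.Date(c("2016-01-01", "2016-01-04", "2016-01-08"))

week(d)     # seven-day blocks counted from January 1st
isoweek(d)  # ISO 8601 weeks, which start on Monday
```

Note that isoweek() assigns 2016-01-01 to the last ISO week of 2015, so grouping across a year boundary would need the ISO year as well.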
Your code is perfectly fine. You just have to add a timezone parameter ("UTC" or "GMT", which is equivalent) to the 3rd line in your code above, and you do not have to change the timezone environment variable, which is always risky in case you forget to reset it. There is no need for conversions from df to xts, etc.
seqs <- seq(as.POSIXct("2016-01-01 00:00:00", tz = "UTC"), as.POSIXct("2016-01-30 00:00:00", tz = "UTC"), by = "30 mins")
> both(weekdata[[2]])
value
2016-01-04 00:00:00 145
2016-01-04 00:30:00 146
2016-01-04 01:00:00 147
value
2016-01-10 22:30:00 478
2016-01-10 23:00:00 479
2016-01-10 23:30:00 480
If your current timezone is not "UTC", you will get a warning making you aware of this fact.
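If the series has to stay in Asia/Kolkata local time rather than UTC, one possible alternative is to build the grouping labels from the local timestamps yourself. The following is only a sketch under that assumption: it labels each row with its local year and Monday-based week number ("%W") and splits on that label, instead of relying on split(..., f = "weeks"). The variable name wk is illustrative.

```r
library(xts)

# Set the timezone on the timestamps directly instead of via Sys.setenv()
seqs <- seq(as.POSIXct("2016-01-01", tz = "Asia/Kolkata"),
            as.POSIXct("2016-01-30", tz = "Asia/Kolkata"), by = "30 mins")
ob <- xts(data.frame(value = seq_along(seqs)), seqs)

# Label every observation with local year and Monday-based week number
wk <- format(index(ob), "%Y-%W", tz = "Asia/Kolkata")

# Split the row positions by label, then subset the xts object per group
weekdata <- lapply(split(seq_len(nrow(ob)), wk), function(i) ob[i, ])

head(weekdata[[2]], 2)
```

With this labelling, each list element should start at local midnight on a Monday (apart from the first, partial week). One caveat: "%W" restarts at 00 in January, so a week that spans a year boundary would end up in two groups.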