I'm a beginner with R and have tried searching for data extraction for certain time periods but can't seem to find anything.
I have a time series of continuous data measured at 10-minute intervals over a period of five months. For simplicity's sake, the data is available in two columns as follows:
Timestamp Temp.Diff
2/14/2011 19:00 -0.385
2/14/2011 19:10 -0.535
2/14/2011 19:20 -0.484
2/14/2011 19:30 -0.409
2/14/2011 19:40 -0.385
2/14/2011 19:50 -0.215
... And it goes on for the next five months. I have read the Timestamp column using as.POSIXct() into R.
Assuming that only certain times of the day are of interest to me (e.g. from 12 noon to 3 PM), I would like either to exclude the other hours of the day, OR to extract just those 3 hours but still have the data flow sequentially (i.e. as a time series). I understand that you can easily subset data if you know the row numbers, but as this is a much larger dataset, is there a way to code R so it automatically recognises the time period I'm looking at?
You seem to know the basic idea, but are just missing the details. As you mentioned, we just transform the Timestamps into POSIX objects then subset.
lubridate Solution
The easiest way is probably with lubridate. First load the package:
library(lubridate)
Next convert the timestamp:
## mdy_hm = month, day, year, then hour and minute
d = mdy_hm(dd$Timestamp)
Then we select what we want. In this case, I want any rows with a time after 7:30 PM (regardless of day):
dd[(hour(d) == 19 & minute(d) > 30) | hour(d) >= 20, ]
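For the window asked about in the question (12 noon to 3 PM), the same idea might look like the sketch below. The sample rows and the names `sample_dd`, `mins` and `noon_to_three` are made up for illustration; only `mdy_hm()`, `hour()` and `minute()` come from lubridate. Converting each time to minutes since midnight turns the window into a plain numeric comparison.

```r
library(lubridate)

# Hypothetical sample data in the same layout as the question
sample_dd <- data.frame(
  Timestamp = c("2/14/2011 11:50", "2/14/2011 12:00", "2/14/2011 14:50",
                "2/14/2011 15:00", "2/14/2011 15:10", "2/15/2011 12:30"),
  Temp.Diff = c(-0.385, -0.535, -0.484, -0.409, -0.385, -0.215),
  stringsAsFactors = FALSE
)

# Parse month/day/year hour:minute strings into date-times
d <- mdy_hm(sample_dd$Timestamp)

# Minutes since midnight, ignoring the date
mins <- hour(d) * 60 + minute(d)

# Keep rows from 12:00 to 15:00 inclusive, on any day
noon_to_three <- sample_dd[mins >= 12 * 60 & mins <= 15 * 60, ]
noon_to_three
```

Rows stay in their original order, so the result is still sequential in time. Change `<=` to `<` if 15:00 itself should be excluded.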
Base R solution
First create a lower limit:
lower = strptime("2/14/2011 19:30","%m/%d/%Y %H:%M")
Next transform the Timestamps into POSIX objects:
d = strptime(dd$Timestamp, "%m/%d/%Y %H:%M")
Finally, a bit of dataframe subsetting:
dd[format(d,"%H:%M") > format(lower,"%H:%M"),]
Thanks to plannapus for this last part
Data for the above example:
dd = read.table(textConnection('Timestamp Temp.Diff
"2/14/2011 19:00" -0.385
"2/14/2011 19:10" -0.535
"2/14/2011 19:20" -0.484
"2/14/2011 19:30" -0.409
"2/14/2011 19:40" -0.385
"2/14/2011 19:50" -0.215'), header=TRUE)
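The base R approach above can be extended to a two-sided window such as noon to 3 PM. This is a minimal sketch with made-up sample rows (`sample_dd`, `hm`, `in_window` and `window_rows` are illustrative names). It assumes the timestamps can be treated as UTC, which avoids daylight-saving gaps during parsing. It relies on zero-padded "HH:MM" strings sorting in time order.

```r
# Hypothetical sample data in the same layout as the question
sample_dd <- data.frame(
  Timestamp = c("2/14/2011 11:50", "2/14/2011 12:00", "2/14/2011 14:50",
                "2/14/2011 15:00", "2/14/2011 15:10", "2/15/2011 12:30"),
  Temp.Diff = c(-0.385, -0.535, -0.484, -0.409, -0.385, -0.215),
  stringsAsFactors = FALSE
)

# Parse the strings; tz = "UTC" is an assumption made for this sketch
d <- strptime(sample_dd$Timestamp, "%m/%d/%Y %H:%M", tz = "UTC")

# Time of day as zero-padded text, e.g. "12:00"
hm <- format(d, "%H:%M")

# TRUE for rows between 12:00 and 15:00 inclusive, on any day
in_window <- hm >= "12:00" & hm <= "15:00"

window_rows <- sample_dd[in_window, ]
window_rows
```

To exclude the window instead of keeping it, subset with `!in_window`.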
You can do this easily with the time-based subsetting in the xts package. Assuming your data.frame is named Data:
library(xts)
x <- xts(Data$Temp.Diff, Data$Timestamp)
y <- x["T12:00/T15:00"]
# you need the leading zero if the hour is a single digit
z <- x["T09:00/T12:00"]
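For a self-contained run of the xts approach, the sketch below builds a small `Data` data.frame with made-up rows, converts the timestamps with `as.POSIXct()` as described in the question, and then turns the subset back into a data.frame. The UTC time zone and the name `result` are assumptions for illustration.

```r
library(xts)

# Hypothetical sample data; Timestamp is parsed to POSIXct up front
Data <- data.frame(
  Timestamp = as.POSIXct(
    c("2/14/2011 11:50", "2/14/2011 12:00", "2/14/2011 14:50",
      "2/14/2011 15:00", "2/14/2011 15:10", "2/15/2011 12:30"),
    format = "%m/%d/%Y %H:%M", tz = "UTC"
  ),
  Temp.Diff = c(-0.385, -0.535, -0.484, -0.409, -0.385, -0.215)
)

# Build the time series: values first, time index second
x <- xts(Data$Temp.Diff, order.by = Data$Timestamp)

# Time-of-day subsetting: noon to 3 PM on every day in the series
y <- x["T12:00/T15:00"]

# Back to a plain data.frame if the xts object is not needed downstream
result <- data.frame(Timestamp = index(y), Temp.Diff = coredata(y)[, 1])
result
```

Keeping the subset as an xts object (`y`) preserves the time index, which is convenient if the next step is plotting or aggregation by period.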