My question is about handling dates and times in an air quality database that recorded data every ten minutes, all day, every day, from 2002 through 2008.
I want to produce several analyses and plots, but only for the morning peak hours, from 6:00 through 8:00 a.m. I have tried to generate the diagrams for that interval, but R always plots the full 24 hours of each day, which distorts the data for the peak hours.
I would greatly appreciate your guidance on how to select and plot only the peak-hour interval, and on how to generate the diagrams.
I have the following script to select a date interval, but I also want to add an hour interval (6-8 a.m.) and plot only the data in that interval:
# select interval
start.date <- as.POSIXct("2007-03-27 05:00", tz = "GMT")
end.date <- as.POSIXct("2007-05-27 05:00", tz = "GMT")
subdata <- subset(mydata, date >= start.date & date <= end.date,
                  select = c(date, nox, co))
#
# plot the variables
If your date-times are in a column called 'dtm', then this code should get the records that fall within the interval from 6:00 to 8:00 a.m.:
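A minimal sketch of such a selection, assuming `dtm` is a POSIXct column stored in GMT and that the 8:00 reading itself should be included. The sample data frame is made up purely for illustration.

```r
# Hypothetical sample: one day of 10-minute readings (144 rows)
mydata <- data.frame(
  dtm = seq(as.POSIXct("2007-03-27 00:00", tz = "GMT"),
            by = "10 min", length.out = 144),
  nox = seq_len(144)
)

# Minutes since midnight, computed in the same time zone as the data
mins <- as.integer(format(mydata$dtm, "%H", tz = "GMT")) * 60 +
        as.integer(format(mydata$dtm, "%M", tz = "GMT"))

# Keep the records from 6:00 through 8:00 a.m., inclusive
peak <- subset(mydata, mins >= 360 & mins <= 480)
```

Working in minutes since midnight makes it easy to move the window later (say, 6:30 to 8:30) without changing the approach.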
You could also use an indexed approach with "[", but if you have NAs they would get dragged along unless you specifically excluded them.
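To illustrate the NA issue, here is a small sketch with a made-up data frame in which one date-time is missing. The column names and values are assumptions for the example only.

```r
# Hypothetical sample with one missing date-time
mydata <- data.frame(
  dtm = as.POSIXct(c("2007-03-27 05:50", "2007-03-27 06:10", NA,
                     "2007-03-27 07:30", "2007-03-27 09:00"), tz = "GMT"),
  nox = c(10, 20, 30, 40, 50)
)

hr <- as.integer(format(mydata$dtm, "%H", tz = "GMT"))
in_peak <- hr >= 6 & hr < 8            # hours 6 and 7; NA where dtm is NA

with_na    <- mydata[in_peak, ]        # logical NA yields an all-NA row
without_na <- mydata[which(in_peak), ] # which() keeps only the TRUE positions
via_subset <- subset(mydata, in_peak)  # subset() also drops the NA case
```

Wrapping the condition in `which()` (or adding `& !is.na(in_peak)`) is the usual way to get the same result from "[" as from `subset()`.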
I recommend you use a time series class instead of a data.frame. Subsetting by a time interval each day is easy with xts:
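A sketch of what this could look like, assuming the xts package is installed and the timestamps are in GMT. The series below is invented for illustration, and the `"T06:00/T08:00"` string is xts's time-of-day subsetting syntax.

```r
library(xts)

# Hypothetical sample: two days of 10-minute readings
idx <- seq(as.POSIXct("2007-03-27 00:00", tz = "GMT"),
           by = "10 min", length.out = 288)
aq <- xts(cbind(nox = seq_len(288), co = seq_len(288) / 10),
          order.by = idx)

# Select the 6:00 to 8:00 a.m. window on every day in the series
peak <- aq["T06:00/T08:00"]
```

The same string works no matter how many days the series covers, so it can be combined with an ordinary date-range subset taken beforehand.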
If this were a data.frame, I would start by extracting the time of day for each entry into a new column and then tag each row with a "peak" flag; working with it then becomes much easier. Ditto for day of week. Since there are only about 350k rows, this is going to be reasonably quick and it's a one-off, so you could do something ugly like:
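One way this might look, assuming the date-times are in a POSIXct column called `date` (as in the question's script) and stored in GMT. The sample data and the new column names (`hour`, `minute`, `peak`, `wday`) are hypothetical.

```r
# Hypothetical sample: one week of 10-minute readings
mydata <- data.frame(
  date = seq(as.POSIXct("2007-03-26 00:00", tz = "GMT"),
             by = "10 min", length.out = 144 * 7),
  nox = seq_len(144 * 7)
)

# Time of day as separate columns
mydata$hour   <- as.integer(format(mydata$date, "%H", tz = "GMT"))
mydata$minute <- as.integer(format(mydata$date, "%M", tz = "GMT"))

# Flag rows from 6:00 through 8:00 a.m., inclusive
mins_of_day <- mydata$hour * 60 + mydata$minute
mydata$peak <- mins_of_day >= 360 & mins_of_day <= 480

# Day of week as a number (0 = Sunday, ..., 6 = Saturday), locale-independent
mydata$wday <- as.POSIXlt(mydata$date, tz = "GMT")$wday
```

Using the numeric `wday` rather than `weekdays()` avoids day names that change with the session locale.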
Now you can easily select out only those records that are from peak hours - which will be a much smaller subset to graph - less than 50k records.
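For example, assuming the data frame already carries a logical `peak` column (a hypothetical name, built here for a made-up sample), the selection and a basic plot could be sketched as:

```r
# Hypothetical sample: three days of 10-minute readings with a peak flag
mydata <- data.frame(
  date = seq(as.POSIXct("2007-03-27 00:00", tz = "GMT"),
             by = "10 min", length.out = 144 * 3),
  nox = seq_len(144 * 3)
)
mins_of_day <- as.integer(format(mydata$date, "%H", tz = "GMT")) * 60 +
               as.integer(format(mydata$date, "%M", tz = "GMT"))
mydata$peak <- mins_of_day >= 360 & mins_of_day <= 480

# Keep only the peak-hour records
peakdata <- subset(mydata, peak)

# Plot against time of day so every day overlays on the same 6-8 a.m. axis
peakdata$tod <- mins_of_day[mydata$peak] / 60
plot(peakdata$tod, peakdata$nox,
     xlab = "Hour of day", ylab = "nox")
```

Plotting against time of day rather than the full date-time keeps the x-axis limited to the peak window instead of stretching across the hours in between.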
As I'm sure you are going to be doing many such analyses with different hypotheses, I'd suggest that you add various columns such as hour of day, day of week etc. to your data.frame and leave them there, and just save this big data.frame like:
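A small sketch of saving and restoring the augmented data frame with base R's `save()` and `load()`. The file name and the sample data are assumptions for illustration; in practice you would use a path of your own.

```r
# Hypothetical augmented data frame
mydata <- data.frame(
  date = seq(as.POSIXct("2007-03-27 00:00", tz = "GMT"),
             by = "10 min", length.out = 144),
  nox = seq_len(144)
)
mydata$hour <- as.integer(format(mydata$date, "%H", tz = "GMT"))
mydata$wday <- as.POSIXlt(mydata$date, tz = "GMT")$wday

# Hypothetical file location; a temporary directory is used here
path <- file.path(tempdir(), "airquality_augmented.RData")
save(mydata, file = path)

# In a later session, restore the object under its original name
rm(mydata)
load(path)
```

`saveRDS()` and `readRDS()` are an alternative if you would rather choose the object name when reading the file back in.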