How to create a time scatterplot with R?

Posted 2020-05-21 04:58

Question:

The data are a series of dates and times.

date time
2010-01-01 09:04:43
2010-01-01 10:53:59
2010-01-01 10:57:18
2010-01-01 10:59:30
2010-01-01 11:00:44
…

My goal is to draw a scatterplot with the date on the horizontal axis (x) and the time of day on the vertical axis (y). I could also vary the color intensity when there is more than one time for the same date.

It was quite easy to create a histogram of dates.

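# Read the space-separated file; the code below assumes it has a column named "day"
mydata <- read.table("mydata.txt", header=TRUE, sep=" ")
# Count observations per week without drawing the histogram
mydatahist <- hist(as.Date(mydata$day), breaks = "weeks", freq=TRUE, plot=FALSE)
# Draw the weekly counts as a bar plot
barplot(mydatahist$counts, border=NA, col="#ccaaaa")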

  1. I haven't figured out yet how to create a scatterplot where the axes are dates and/or times.
  2. I would also like to have axes that are not necessarily linear dates (YYYY-MM-DD), but based on the day of the year such as MM-DD (so that different years accumulate), or even cycling over the days of the week.

Any help, RTFM URI slapping or hints are welcome.

Answer 1:

The ggplot2 package handles dates and times quite easily.

Create some date and time data:

dates <- as.POSIXct(as.Date("2011/01/01") + sample(0:365, 100, replace=TRUE))
times <- as.POSIXct(runif(100, 0, 24*60*60), origin="2011/01/01")

df <- data.frame(
  dates = dates,
  times = times
)

Then get some ggplot2 magic. ggplot will automatically deal with dates, but to get the time axis formatted properly use scale_y_datetime():

library(ggplot2)
library(scales)
ggplot(df, aes(x=dates, y=times)) + 
  geom_point() + 
  scale_y_datetime(breaks=date_breaks("4 hour"), labels=date_format("%H:%M")) + 
  theme(axis.text.x=element_text(angle=90))


Regarding the last part of your question, on grouping by week, etc.: to achieve this you may have to pre-summarize the data into the buckets that you want. You can possibly use plyr for this and then pass the resulting data to ggplot.
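As a rough illustration of that pre-summarizing step, here is a minimal sketch in base R (it uses table() rather than plyr). The sample rows and the column names month_day and weekday are made up for the example; only the date/time layout comes from the question.

```r
# Hypothetical sample data in the question's layout (date and time columns)
mydata <- data.frame(
  date = c("2010-01-01", "2010-01-01", "2010-01-04", "2011-01-04"),
  time = c("09:04:43", "10:53:59", "10:57:18", "11:00:44"),
  stringsAsFactors = FALSE
)

d <- as.Date(mydata$date)

# Bucket keys: month-day (so different years fall into the same bucket)
# and ISO weekday number (1 = Monday ... 7 = Sunday)
mydata$month_day <- format(d, "%m-%d")
mydata$weekday   <- format(d, "%u")

# Count the observations in each month-day bucket
counts <- as.data.frame(table(month_day = mydata$month_day),
                        stringsAsFactors = FALSE)
```

The counts data frame has one row per bucket with the number of observations in Freq, and could be handed to ggplot in the same way as df above.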



Answer 2:

I'd start by reading about as.POSIXct, strptime, strftime, and difftime. These and related functions should allow you to extract the desired subsets of your data. The formatting is a little tricky, so play with the examples in the help files.
And, once your dates are converted to a POSIX class, as.numeric() will convert them all to numeric values, hence easy to sort, plot, etc.
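
To make that concrete, here is a small sketch of one way to split the timestamps into a date for the x axis and a numeric hour of day for the y axis. The sample rows and the names raw, stamp, day and hours are invented for the example, and UTC is assumed so that daylight saving time does not interfere.

```r
# Hypothetical rows in the question's "date time" layout
raw <- data.frame(
  date = c("2010-01-01", "2010-01-01", "2010-01-02"),
  time = c("09:04:43", "10:53:59", "11:00:44"),
  stringsAsFactors = FALSE
)

# Combine both columns into a single POSIXct timestamp (UTC is an assumption)
stamp <- as.POSIXct(paste(raw$date, raw$time),
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# x value: the calendar day
day <- as.Date(stamp)

# y value: time elapsed since midnight of that day, as a plain number of hours
hours <- as.numeric(difftime(stamp, as.POSIXct(day), units = "hours"))
```

Calling plot(day, hours) on the result should then give a basic scatterplot with dates along x and the hour of day along y.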

Edit: Andre's suggestion to play with ggplot to simplify your axis specifications is a good one.