Convert multiple irregular time series into regula

Posted 2019-08-21 11:12

Question:

I have a data.frame holding multiple irregular time series, which looks like this:

station   Time     WaterTemp
1       01-01-1974  5.0000000
1       01-02-1974  5.0000000
1       01-03-1974  8.6000004
1       01-05-1974  8.1333332
1       01-07-1974  12.7999999
2       01-01-1974  5.0000000
2       01-02-1974  5.0000000
2       01-04-1974  8.6000004
2       01-06-1974  8.1333332
2       01-08-1974  12.7999999

I want to convert this into a regular time series (ts) object, which should look like this:

Time      Station1    Station2
Jan1974   5.0000000   5.0000000
Feb1974   5.0000000   5.0000000
Mar1974   8.6000004   NA
Apr1974   NA          8.6000004
May1974   8.1333332   NA
Jun1974   NA          8.1333332
Jul1974   12.7999999  NA
Aug1974   NA          12.7999999
Sep1974   NA          NA
Oct1974   7.9         NA
Nov1974   NA          NA
Dec1974   NA          7.4

How do I do that? There are lots of solutions for a single time series, but I haven't come across one that deals with multiple time series.

Thanks,

Answer 1:

If DF is your data frame, try the following. Converting to ts in the last line makes the series regular (a ts object cannot have gaps, so missing months become NA); we then convert back to zoo:

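library(zoo)
# one column per station (split = 1); dates are parsed from column 2
z <- read.zoo(DF, split = 1, index.column = 2, format = "%d-%m-%Y")
z.ym <- aggregate(z, as.yearmon, identity) # convert index to yearmon
# round trip through ts regularizes the series, then restore the yearmon index
zm <- aggregate(as.zoo(as.ts(z.ym)), as.yearmon, identity)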

An alternative to the last line would be these two lines:

g <- zoo(, seq(start(z.ym), end(z.ym), deltat(z.ym))) # grid
zm <- merge(z.ym, g)

In either case, coredata(zm) is now the data part and time(zm) is the index. You may still want to keep the result as a zoo object so that you can use zoo's other time series facilities and the many packages that accept time series in that form.
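To make the regularization step visible, here is a minimal sketch on a small made-up series that actually has a gap (the question's sample data happens to cover every month from January to August once both stations are combined). The object names z.gap, g and zg and the values are illustrative assumptions, not part of the original answer; as.ts(zm) would give the ts object the question asks for in the same way.

```r
library(zoo)

# Hypothetical monthly series with March and April missing
z.gap <- zoo(c(5, 5, 8.1), as.yearmon(1974 + c(0, 1, 4) / 12))

# Route 1: a ts object must be regular, so as.ts() pads the gap with NA
x <- as.ts(z.gap)

# Route 2: merge with an empty zoo object that carries the full monthly index
g  <- zoo(, as.yearmon(1974 + 0:4 / 12))
zg <- merge(z.gap, g)

m  <- coredata(zg)  # data part, with NA for the two missing months
tt <- time(zg)      # yearmon index covering Jan 1974 to May 1974
```

Route 1 is convenient when a ts object is the end goal; route 2 stays in zoo throughout and keeps the yearmon index without a round trip.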

Note: Here is a complete self-contained reproducible example:

DF <- structure(list(station = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L), Time = structure(c(1L, 2L, 3L, 5L, 7L, 1L, 2L, 4L, 6L, 8L
), .Label = c("01-01-1974", "01-02-1974", "01-03-1974", "01-04-1974", 
"01-05-1974", "01-06-1974", "01-07-1974", "01-08-1974"), class = "factor"), 
    WaterTemp = c(5, 5, 8.6000004, 8.1333332, 12.7999999, 5, 
    5, 8.6000004, 8.1333332, 12.7999999)), .Names = c("station", 
"Time", "WaterTemp"), class = "data.frame", row.names = c(NA, 
-10L))

library(zoo)
z <- read.zoo(DF, split = 1, index.column = 2, format = "%d-%m-%Y")
z.ym <- aggregate(z, as.yearmon, identity) # convert to yearmon
zm <- aggregate(as.zoo(as.ts(z.ym)), as.yearmon, identity)

giving:

> zm
                 1         2
Jan 1974  5.000000  5.000000
Feb 1974  5.000000  5.000000
Mar 1974  8.600000        NA
Apr 1974        NA  8.600000
May 1974  8.133333        NA
Jun 1974        NA  8.133333
Jul 1974 12.800000        NA
Aug 1974        NA 12.800000
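For readers who want a ts object without the zoo dependency, the following is a rough base R sketch of the same idea. It is not the method used in the answer above: it builds an NA-filled month-by-station matrix and hands it to ts(). It assumes at most one observation per station per month (a later duplicate would overwrite an earlier one), and the column names Station1 and Station2 are taken from the desired output in the question.

```r
# Same layout and values as the DF in the reproducible example
DF <- data.frame(
  station   = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  Time      = c("01-01-1974", "01-02-1974", "01-03-1974", "01-05-1974",
                "01-07-1974", "01-01-1974", "01-02-1974", "01-04-1974",
                "01-06-1974", "01-08-1974"),
  WaterTemp = c(5, 5, 8.6000004, 8.1333332, 12.7999999,
                5, 5, 8.6000004, 8.1333332, 12.7999999)
)

d <- as.Date(DF$Time, format = "%d-%m-%Y")

# Running month count, so that consecutive months differ by exactly 1
mon  <- as.integer(format(d, "%Y")) * 12L + as.integer(format(d, "%m")) - 1L
grid <- seq(min(mon), max(mon))          # every month in the observed range
stations <- sort(unique(DF$station))

# NA-filled matrix: one row per month, one column per station
m <- matrix(NA_real_, nrow = length(grid), ncol = length(stations),
            dimnames = list(NULL, paste0("Station", stations)))
m[cbind(match(mon, grid), match(DF$station, stations))] <- DF$WaterTemp

# Regular monthly ts starting at the first observed month
x <- ts(m, start = c(min(mon) %/% 12L, min(mon) %% 12L + 1L), frequency = 12)
```

Months with no observation for a station simply stay NA, which mirrors what the zoo round trip produces.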
