Applying models to multiple time-series

Posted 2019-04-12 00:18

Question:

Let's say I have multiple time series for which I want forecasts. If I have the appropriate time-series object for each, I could fit (for the sake of example) an ARIMA model and so on. But, I know there must be an easy way to automate this process when all of the series are in one xts object (leaving aside the fact that different variables might require different ARIMA models; that's probably a question for another time).

Some sample data as an xts object (daily revenue for six different businesses):

library(xts)

ts <- structure(c(534L, 549L, 636L, 974L, 848L, 895L, 1100L, 1278L, 
1291L, 1703L, 1532L, 533L, 619L, 642L, 939L, 703L, 759L, 1213L, 
1195L, 1153L, 1597L, 1585L, 649L, 597L, 628L, 924L, 703L, 863L, 
1261L, 1161L, 1212L, 1616L, 1643L, 583L, 694L, 611L, 891L, 730L, 
795L, 1242L, 1210L, 1159L, 1501L, 1702L, 513L, 532L, 580L, 917L, 
978L, 947L, 1227L, 1253L, 1121L, 1697L, 1569L, 646L, 636L, 516L, 
869L, 980L, 937L, 1173L, 1203L, 1204L, 1511L, 1640L), .Dim = c(11L, 
6L), .Dimnames = list(NULL, c("Americas_Globe", "Americas_Lucky", 
"Americas_Star", "Asia_Star", "EuroPac_Globe", "EuroPac_Lucky"
)), index = structure(c(1367384400, 1367470800, 1367557200, 1367643600, 
1367730000, 1367816400, 1367902800, 1367989200, 1368075600, 1368162000, 
1368248400), tzone = "", tclass = c("POSIXlt", "POSIXt")), .indexCLASS = c("POSIXlt", 
"POSIXt"), tclass = c("POSIXlt", "POSIXt"), .indexTZ = "", tzone = "", class = c("xts", 
"zoo"))

I can extract one time-series from this object...

ts.amerglob <- ts[,1] # Extract the "Americas_Globe" company time-series

and model it however (for the sake of example, fit an ARIMA model):

ts.ag.arima <- arima(ts.amerglob, order=c(0,1,1))

and make forecasts

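library(forecast)
# forecast() dispatches on the Arima fit; recent versions of the package do not export forecast.Arima
ts.ag.forecasts <- forecast(ts.ag.arima, h=5)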

But what if I want to do this for each of the 6 companies in this ts object?

When fitting standard regression models, I've used by() to do something similar with subsets of the data. But applying that methodology here doesn't seem to work:

co.arima <- by(ts, ts[,1:6],
    function(x) arima(x, order=c(1,0,1)))

returns an error about sequence length:

Error in tapply(seq_len(11L), list(INDICES = c(534L, 549L, 636L, 974L,  : 
  arguments must have same length

Is there any easy way to apply a time-series model to multiple time-series at once and extract relevant information? Ultimately what I'm looking to do is put the forecasts for each of these time series into one data.frame or matrix (but it would be great to be able to do the same thing with intermediate steps in the modeling process, such as auto.arima() output for each time-series)...

Answer 1:

Simply use lapply here (dat.ts is the xts object from the question, where it is called ts):

res <- lapply(dat.ts,arima,order=c(1,0,1))
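
To get the forecasts from every model into a single matrix, as the question asks, one option is to follow the lapply call with sapply. What follows is only a minimal sketch under assumptions that are not part of the original answer: it uses base R alone (predict() rather than the forecast package), it copies three of the question's series into a plain data.frame with the illustrative name dat, and it borrows the order c(0,1,1) and the horizon h = 5 from the question.

```r
# Three of the question's series as columns of a plain data.frame
# (assumption: base R only; dat, fits and fc are illustrative names).
dat <- data.frame(
  Americas_Globe = c(534, 549, 636, 974, 848, 895, 1100, 1278, 1291, 1703, 1532),
  Americas_Lucky = c(533, 619, 642, 939, 703, 759, 1213, 1195, 1153, 1597, 1585),
  Americas_Star  = c(649, 597, 628, 924, 703, 863, 1261, 1161, 1212, 1616, 1643)
)

# One model per column; lapply treats a data.frame as a list of columns.
fits <- lapply(dat, function(x) arima(x, order = c(0, 1, 1)))

# predict() on an Arima fit returns a list with $pred (point forecasts) and $se.
h <- 5
fc <- sapply(fits, function(fit) as.numeric(predict(fit, n.ahead = h)$pred))
rownames(fc) <- paste0("h", seq_len(h))

# One column per series, one row per forecast step.
print(fc)
```

Because every fit returns the same number of forecasts, sapply simplifies the result to a matrix; as.data.frame(fc) turns it into a data.frame if that is preferred.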

If you want to use a different order parameter for each time series, you can use Map or mapply:

## generate a random list of orders (one (p, d, q) triple per column)
orders <- lapply(seq_len(ncol(dat.ts)), function(x) sample(c(0, 1), 3, replace = TRUE))
## for each series, fit an arima model with its corresponding order
Map(function(x, ord) arima(x, order = ord), as.list(dat.ts), orders)

EDIT: get the order using auto.arima from the forecast package:

Note that I rarely use this package, so I am not sure of the final result. I show here just the idea of using lapply:

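library(forecast)
orders <- lapply(dat.ts, function(x) {
  mod <- auto.arima(x)
  ## mod$arma holds (p, q, P, Q, period, d, D); reorder it and keep (p, d, q)
  mod$arma[c(1, 6, 2, 3, 7, 4, 5)][1:3]
})
orders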
$Americas_Globe
[1] 0 1 0
$Americas_Lucky
[1] 0 1 0
$Americas_Star
[1] 0 1 1
$Asia_Star
[1] 0 1 0
$EuroPac_Globe
[1] 0 1 0
$EuroPac_Lucky
[1] 0 1 0
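
Once a per-series order is available, it can be passed back to arima with Map, and intermediate results from each model can be gathered in one data.frame. The sketch below rests on a few assumptions: the orders list is copied by hand from the output printed above for three of the series (rather than recomputed with auto.arima), the data sit in a plain data.frame, and dat, fits and summary_df are illustrative names.

```r
# Three of the question's series in a plain data.frame (assumption: base R only).
dat <- data.frame(
  Americas_Globe = c(534, 549, 636, 974, 848, 895, 1100, 1278, 1291, 1703, 1532),
  Americas_Lucky = c(533, 619, 642, 939, 703, 759, 1213, 1195, 1153, 1597, 1585),
  Americas_Star  = c(649, 597, 628, 924, 703, 863, 1261, 1161, 1212, 1616, 1643)
)

# Orders copied by hand from the auto.arima output shown above.
orders <- list(
  Americas_Globe = c(0, 1, 0),
  Americas_Lucky = c(0, 1, 0),
  Americas_Star  = c(0, 1, 1)
)

# Pair each column with its own order; indexing by name keeps the two aligned.
fits <- Map(function(x, ord) arima(x, order = ord), dat, orders[names(dat)])

# Collect a few per-model quantities into one data.frame.
summary_df <- data.frame(
  series = names(fits),
  order  = sapply(orders[names(fits)], paste, collapse = ","),
  aic    = sapply(fits, AIC),
  sigma2 = sapply(fits, function(f) f$sigma2),
  row.names = NULL
)
print(summary_df)
```

Any other quantity stored in the fitted objects (coefficients, log-likelihood, residuals) can be pulled out with the same sapply or lapply pattern.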


Tags: r xts