I used the hts package in R to fit a hierarchical time series (HTS) model on training data, forecast with the "arima" option, and computed the accuracy on the holdout/test data. Here is my code:
library(hts)
# Training data: monthly series starting January 2000
data <- read.csv("C:/TS.csv")
ts_train <- ts(data[, -1], frequency = 12, start = c(2000, 1))
hts_train <- hts(ts_train, nodes = list(2, c(4, 2)))
# Holdout/test data: monthly series starting January 2003
data.test <- read.csv("C:/TStest.csv")
ts_test <- ts(data.test[, -1], frequency = 12, start = c(2003, 1))
hts_test <- hts(ts_test, nodes = list(2, c(4, 2)))
# Bottom-up forecasts from ARIMA models, then out-of-sample accuracy
forecast <- forecast(hts_train, h = 15, method = "bu", fmethod = "arima", keep.fitted = TRUE, keep.resid = TRUE)
accuracy <- accuracy.gts(forecast, hts_test)
Now, let's suppose I'm happy with the accuracy on the holdout sample and I'd like to lump the test data back with the train data and re-forecast using the full set.
I tried using this code:
data.full <- read.csv("C:/TS_full.csv")
ts_full <- ts(data.full[, -1], frequency = 12, start = c(2000, 1))
hts_full <- hts(ts_full, nodes = list(2, c(4, 2)))
forecast.full <- forecast(hts_full, h = 15, method = "bu", fmethod = "arima", keep.fitted = TRUE, keep.resid = TRUE)
I'm not sure this is the right way to do it, because I don't know whether the ARIMA models fitted to my training data are the same ARIMA models now being used to forecast the full data set (I presume fmethod="arima" uses auto.arima). I'd like the models to remain the same; otherwise, the models evaluated by my out-of-sample accuracy measures differ from the models used for the final forecast.
I see there is a FUN argument that represents "a user-defined function that returns an object which can be passed to the forecast function". Perhaps that argument can be used in the last line of my code somehow to make sure the models I fit on the train data are used to forecast the full data set?
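Failing that, the only workaround I can think of is to bypass forecast() on the hts object altogether: select one ARIMA model per bottom-level series on the training data with auto.arima, re-apply each of those models to the corresponding full series through the model argument of Arima (which, as far as I understand, fits the same model without re-estimating its parameters), and then aggregate the bottom-level forecasts myself. Below is a rough sketch of that idea, not a confirmed solution. The data are simulated stand-ins for my CSV files, and the names y_full, y_train, fits_train, fc_bottom and fc_all are made up for illustration.

```r
library(forecast)
library(hts)

set.seed(1)

# Simulated stand-in for the six bottom-level series (hypothetical data):
# 51 monthly observations, January 2000 to March 2004
y_full <- ts(matrix(rnorm(51 * 6, mean = 100, sd = 5), ncol = 6),
             frequency = 12, start = c(2000, 1))
# Training window: January 2000 to December 2002
y_train <- window(y_full, end = c(2002, 12))

h <- 15
nodes <- list(2, c(4, 2))

# 1. Select one ARIMA model per bottom-level series on the training data
fits_train <- lapply(seq_len(ncol(y_train)),
                     function(i) auto.arima(y_train[, i]))

# 2. Re-apply each training model to the full series (no new model selection)
#    and forecast h steps ahead
fc_bottom <- sapply(seq_len(ncol(y_full)), function(i) {
  refit <- Arima(y_full[, i], model = fits_train[[i]])
  forecast(refit, h = h)$mean
})

# 3. Bottom-up aggregation: treat the bottom-level forecasts as an hts object
#    and sum them up the hierarchy
fc_bottom <- ts(fc_bottom, frequency = 12, start = tsp(y_full)[2] + 1 / 12)
fc_hts <- hts(fc_bottom, nodes = nodes)
fc_all <- aggts(fc_hts)  # forecasts for every level of the hierarchy
```

Note that this keeps the training coefficients fixed as well as the model orders. If only the orders should carry over and the coefficients should be re-estimated on the full data, I believe the orders would have to be extracted (for example with arimaorder) and passed to Arima explicitly, which I have not tried.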
Any suggestions for R code that would accomplish this would be much appreciated.