I am new to ARIMA in Python. I have data at a 15-minute frequency covering a few months. While following the Box-Jenkins method to fit a time series model, I ran into an issue towards the end. The ACF and PACF plots for the time series (ts) and the differenced series (ts_diff) are given. I used ARIMA(5,1,2) and then plotted the fitted values (green) against the original values (blue). As you can see from the figure, there is a clear shift (by one period) in the values. What am I doing wrong?
Is the prediction bad? Any insight will be helpful.
This is a standard property of one-step-ahead prediction or forecasting.
The information used for the forecast is the history up to and including the previous period. A peak in a given period, for example, affects the forecast for the next period but cannot influence the forecast for the peak period itself. This makes the forecasts appear shifted in the plot.
A two-step-ahead forecast would give the impression of a shift by two periods.
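To make the effect concrete, here is a minimal sketch using a simulated AR(1) series rather than your data. The coefficient phi = 0.9, the seed, and the helper name pearson are all assumptions chosen for illustration. The one-step-ahead forecast of y[t] is built only from y[t-1], so it lines up better with the previous observation than with the one it is trying to predict.

```python
import random


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


random.seed(0)
phi = 0.9  # assumed AR(1) coefficient, for illustration only

# Simulate y[t] = phi * y[t-1] + noise
y = [0.0]
for _ in range(999):
    y.append(phi * y[-1] + random.gauss(0, 1))

# The one-step-ahead forecast of y[t] can only use y[t-1]
forecast = [phi * y[t - 1] for t in range(1, len(y))]
actual = y[1:]     # the values being forecast
previous = y[:-1]  # the values one period earlier

corr_same = pearson(forecast, actual)
corr_lagged = pearson(forecast, previous)
print("forecast vs. actual value:  ", round(corr_same, 3))
print("forecast vs. previous value:", round(corr_lagged, 3))
```

The forecast tracks the previous value more closely than the current one, which is what shows up as a one-period shift in a plot. An ARIMA(5,1,2) forecast uses more lags than this toy model, but the same limitation on available information applies.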
Just to confirm, am I doing this right then? Here is the code I used.
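import pandas as pd
# Legacy API: statsmodels.tsa.arima_model was removed in statsmodels 0.13;
# newer releases provide statsmodels.tsa.arima.model.ARIMA instead.
from statsmodels.tsa.arima_model import ARIMA

# ts is the 15-minute series indexed by login_time
model = ARIMA(ts, order=(5, 1, 2))
results = model.fit()
# typ='levels' returns in-sample one-step-ahead predictions on the original scale
results_ARIMA = results.predict(typ='levels')
concatenated = pd.concat([ts, results_ARIMA], axis=1, keys=['original', 'predicted'])
concatenated.head(10)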
                     original  predicted
login_time
1970-01-01 20:00:00         2        NaN
1970-01-01 20:15:00         6   2.000186
1970-01-01 20:30:00         9   4.552971
1970-01-01 20:45:00         7   7.118973
1970-01-01 21:00:00         1   7.099769
1970-01-01 21:15:00         4   3.624975
1970-01-01 21:30:00         0   3.867454
1970-01-01 21:45:00         4   1.618120
1970-01-01 22:00:00         9   2.997275
1970-01-01 22:15:00         8   6.300015
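On the question of whether the prediction is bad, one rough check is to compare the model against a naive forecast that simply repeats the previous observation. The sketch below does this for the ten rows shown above only. The variable names and the choice of mean absolute error are assumptions made for illustration, not part of the original code.

```python
# Values copied from the ten rows of output above
original = [2, 6, 9, 7, 1, 4, 0, 4, 9, 8]
predicted = [None, 2.000186, 4.552971, 7.118973, 7.099769,
             3.624975, 3.867454, 1.618120, 2.997275, 6.300015]

# Skip the first row: it has no prediction (NaN) and no previous value
model_errors = [abs(o - p) for o, p in zip(original[1:], predicted[1:])]

# Naive baseline: forecast each value with the one observed just before it
naive_errors = [abs(curr - prev) for prev, curr in zip(original, original[1:])]

mae_model = sum(model_errors) / len(model_errors)
mae_naive = sum(naive_errors) / len(naive_errors)
print("MAE, ARIMA one-step-ahead:", round(mae_model, 3))
print("MAE, naive previous value:", round(mae_naive, 3))
```

Ten rows are far too few to draw a conclusion; running the same comparison over the full series, or better over held-out data, would give a more useful answer.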
In the model you specify (5, 1, 2), you set d = 1. This means that you are differencing the data once, or in other words, the AR and MA terms are fitted to the changes between consecutive observations rather than to the observations themselves.
Sometimes, setting d to 1 will result in an ACF / PACF plot with fewer and / or less dramatic spikes (i.e. less autocorrelation left for the model to explain). In such cases, if you use the fitted model to predict future values, your predictions will deviate less dramatically from your observations if you apply differencing.
First differencing is accomplished through Y'(t) = Y(t) - Y(t-1), where Y(t) refers to the observed value Y at time index t; differencing of order d applies this operation d times. When you use differencing, you lose some data at the left edge of your time series, because the first observation has no predecessor to subtract. How many time points you lose depends on the order of differencing d you use. This is where the missing (NaN) first prediction in your output comes from. The one-period lag you see in the plot itself is a feature of one-step-ahead forecasting rather than of differencing.
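As a small illustration, here is a sketch of differencing applied to the first five values from the question. The helper name difference is hypothetical; differencing of order d means taking consecutive differences d times, and each pass drops one value from the left edge.

```python
def difference(series, d=1):
    """Apply first differencing d times: y'[t] = y[t] - y[t-1]."""
    out = list(series)
    for _ in range(d):
        # Each pass subtracts the previous value, so one point is lost
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


ts_head = [2, 6, 9, 7, 1]  # first five observations from the question

first_diff = difference(ts_head, d=1)
second_diff = difference(ts_head, d=2)

print(first_diff)   # one value shorter than ts_head
print(second_diff)  # two values shorter than ts_head
```

With d = 1 only the very first time point is lost, which matches the single NaN at 20:00:00 in the output above.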
This page may offer a more elaborate explanation (make sure to click around a bit and explore the other pages on there if you want a treatment of the whole process of fitting an ARIMA model).
Hope this helps (or at least puts your mind at ease about the shift)!
Bests,
Evert