statsmodels predict start and end indices

Published 2019-06-25 08:28

Question:

I am trying to use the predict function from the statsmodels package:

prediction = results.predict(start=1,end=len(test),exog=test)

The dates of the input, test, and of the output, prediction, are inconsistent: I get 1/4/2012 to 7/25/2012 for the former and 4/26/2013 to 11/13/2013 for the latter. Part of the difficulty is that my data has no strictly regular frequency: I have daily values excluding weekends and holidays. What is the appropriate way to set the indices?

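# Imports assumed from the aliases used below
import pandas.io.sql as psql
import statsmodels.tsa.api as tsa

# Load the data and index it by date
x = psql.frame_query(query, con=db)
x = x.set_index('date')

# Hold out the last 50 rows for testing
train = x[:-50]
test = x[-50:]

arima = tsa.ARIMA(train['A'], exog=train, order=(2, 1, 1))
results = arima.fit()
prediction = results.predict(start=test.index[0], end=test.index[-1], exog=test)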

I get the error:

There is no frequency for these dates and date 2013-04-26 00:00:00 is not in dates index. Try giving a date that is in the dates index or use an integer

Here's the first set of data:

2013-04-26   -0.9492
2013-04-29    2.2011
...
2013-11-12    0.1178
2013-11-13    2.0449

Answer 1:

The indices can be any datetime-like values, including pandas timestamps. If you use a business-day frequency from pandas, this should work, though holidays may be a problem because they are not standardized. You may be able to get what you want with pandas' custom holiday calendar support.
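As a rough illustration of the business-day suggestion, the sketch below builds a pandas index with a custom business-day frequency and attaches it to data that was loaded without one. The holiday list, the variable names, and the placeholder values are assumptions made for the example, not taken from the question; substitute the dates actually missing from your own data.

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

# Hypothetical holiday list: replace with the dates missing from your own data.
holidays = ["2013-05-27", "2013-07-04", "2013-09-02"]
trading_day = CustomBusinessDay(holidays=holidays)

# Dates between the two endpoints, skipping weekends and the listed holidays.
idx = pd.date_range("2013-04-26", "2013-11-13", freq=trading_day)

# Simulate data loaded without frequency information (placeholder values).
raw = pd.Series(np.arange(len(idx), dtype=float),
                index=pd.DatetimeIndex(list(idx)))

# Attach the frequency; any expected date absent from the data becomes NaN.
regular = raw.asfreq(trading_day)

print(pd.Timestamp("2013-07-04") in regular.index)  # False: listed as a holiday
print(regular.isna().sum())                          # 0: no expected date is missing
```

If asfreq introduces NaN rows, the holiday list does not match the gaps in the data and needs adjusting before the series is passed to a model.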

As I've mentioned in your other questions, without a fully reproducible example there's not much I can say about what you get given what you put in, though this should work if you give the correct index. If the dates have no periodic frequency, e.g., weekends and holidays are excluded without telling the index that, then there's no way to predict what dates you'll want out of sample.
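The error message itself points to the fallback when the index carries no frequency: address the forecast window by integer position. Below is a minimal sketch of that arithmetic, using made-up data and hypothetical names (n_test, forecast, labelled). The commented-out predict call only mirrors the one in the question and is an assumption about how it might be adapted, not a verified fix.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the question's frame: weekdays only, no frequency attached.
dates = pd.DatetimeIndex(list(pd.bdate_range("2013-01-01", periods=200)))
x = pd.DataFrame({"A": np.linspace(0.0, 1.0, len(dates))}, index=dates)

n_test = 50
train, test = x[:-n_test], x[-n_test:]

# Out-of-sample positions count on from the end of the training sample.
start = len(train)                 # first position after the training data
end = len(train) + len(test) - 1   # last position of the hold-out window

# Assumed adaptation of the call in the question (not run here):
# prediction = results.predict(start=start, end=end, exog=test)

# Whatever array comes back can be relabelled with the known test dates.
forecast = np.zeros(end - start + 1)  # placeholder for the model output
labelled = pd.Series(forecast, index=test.index)

print(start, end, len(labelled))  # 150 199 50
```

Because the hold-out dates are already known from test.index, relabelling the result sidesteps the need for the model to guess future dates.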