I've searched the sklearn docs for TimeSeriesSplit and the docs for cross-validation, but I haven't been able to find a working example.
I'm using sklearn version 0.19.
This is my setup:
import xgboost as xgb
from sklearn.model_selection import TimeSeriesSplit
from sklearn.grid_search import GridSearchCV
import numpy as np
X = np.array([[4, 5, 6, 1, 0, 2], [3.1, 3.5, 1.0, 2.1, 8.3, 1.1]]).T
y = np.array([1, 6, 7, 1, 2, 3])
tscv = TimeSeriesSplit(n_splits=2)
for train, test in tscv.split(X):
    print(train, test)
gives:
[0 1] [2 3]
[0 1 2 3] [4 5]
If I try:
model = xgb.XGBRegressor()
param_search = {'max_depth' : [3, 5]}
my_cv = TimeSeriesSplit(n_splits=2).split(X)
gsearch = GridSearchCV(estimator=model, cv=my_cv,
                       param_grid=param_search)
gsearch.fit(X, y)
it gives: TypeError: object of type 'generator' has no len()
I get the problem: GridSearchCV is trying to call len(cv), but my_cv is an iterator without length. However, the docs for GridSearchCV state that cv can be:
int, cross-validation generator or an iterable, optional
I tried using TimeSeriesSplit without the .split(X), but it still didn't work.
I'm sure I'm overlooking something simple, thanks!!
It turns out the problem was that I was using GridSearchCV from sklearn.grid_search, which is deprecated. Importing GridSearchCV from sklearn.model_selection resolved the problem.
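For reference, here is a minimal sketch of the corrected setup. It assumes a scikit-learn version that ships sklearn.model_selection, and it swaps in DecisionTreeRegressor for xgb.XGBRegressor so that the example depends only on scikit-learn. That substitution is an assumption for illustration; any regressor exposing max_depth should behave the same way here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.tree import DecisionTreeRegressor

# Same toy data as in the question.
X = np.array([[4, 5, 6, 1, 0, 2], [3.1, 3.5, 1.0, 2.1, 8.3, 1.1]]).T
y = np.array([1, 6, 7, 1, 2, 3])

# Stand-in estimator (assumption): used instead of xgb.XGBRegressor
# so the sketch needs only scikit-learn.
model = DecisionTreeRegressor(random_state=0)
param_search = {'max_depth': [3, 5]}

# Pass the splitter object itself; GridSearchCV calls its split() method.
tscv = TimeSeriesSplit(n_splits=2)
gsearch = GridSearchCV(estimator=model, cv=tscv, param_grid=param_search)
gsearch.fit(X, y)

best_params = gsearch.best_params_
print(best_params)
```

Passing the splitter object is the choice made in this sketch; the generator returned by TimeSeriesSplit(n_splits=2).split(X) should also be accepted by the model_selection version, since it treats cv as a plain iterable of (train, test) index pairs.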