
Issue with Cross Validation

Posted 2019-09-18 01:40

Question:

I want to use leave-one-out cross-validation, but I am getting the error below:

AttributeError                            Traceback (most recent call last)
<ipython-input-19-f15f1e522706> in <module>()
      3 loo = LeaveOneOut(num_of_examples)
      4 #loo.get_n_splits(X_train_std)
----> 5 for train, test in loo.split(X_train_std):
      6     print("%s %s" % (train, test))

AttributeError: 'LeaveOneOut' object has no attribute 'split'

The detailed code is as follows:

from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

from sklearn.cross_validation import LeaveOneOut
num_of_examples = len(X_train_std)
loo = LeaveOneOut(num_of_examples)
for train, test in loo.split(X_train_std):
    print("%s %s" % (train, test))

Answer 1:

I think you are using a scikit-learn version below 0.18 while following a tutorial written for version 0.18 or later.

In versions prior to 0.18, the LeaveOneOut() constructor has a required parameter n, and the resulting object has no split() method, which is why the call to loo.split() raises the AttributeError. The documentation of LeaveOneOut for version 0.17 lists:

Parameters:
    n : int
        Total number of elements in dataset.

Solution:

  • Either update scikit-learn to version 0.18 or later,
  • or, on an older version, initialize LeaveOneOut with the number of samples:

    loo = LeaveOneOut(len(X_train_std))

Edit:

If you are using scikit-learn version 0.18 or later:

from sklearn.model_selection import LeaveOneOut
loo = LeaveOneOut()  # from 0.18 on, the constructor takes no arguments
for train_index, test_index in loo.split(X):
    print("%s %s" % (train_index, test_index))
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

Otherwise, for versions below 0.18, iterate like this (note that loo.split() is not used here; loo itself is iterated directly):

from sklearn.cross_validation import LeaveOneOut
loo = LeaveOneOut(num_of_examples)
for train_index, test_index in loo:
    print("%s %s" % (train_index, test_index))
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
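
To make it clearer what either loop hands back on each iteration, here is a minimal plain-Python sketch of the index pairs a leave-one-out splitter produces. The helper name leave_one_out_indices is hypothetical and is not part of scikit-learn; it returns plain lists, whereas scikit-learn yields NumPy arrays.

```python
def leave_one_out_indices(n):
    """Yield (train_indices, test_indices) pairs for n samples.

    Hypothetical helper for illustration only; not a scikit-learn API.
    """
    if n < 2:
        raise ValueError("leave-one-out needs at least 2 samples")
    for held_out in range(n):
        # Every sample except the held-out one goes into the training set.
        train = [i for i in range(n) if i != held_out]
        yield train, [held_out]


splits = list(leave_one_out_indices(4))
for train, test in splits:
    print("%s %s" % (train, test))
```

Each sample is held out exactly once, so n samples give n train/test pairs. This is why the older constructor needed n up front, while the newer split(X) can read the number of samples from X.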


Answer 2:

Use

from sklearn.model_selection import train_test_split

rather than sklearn.cross_validation, because the cross_validation module was replaced by model_selection starting with version 0.18.
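
If the same script has to run on both old and new installations, one option is to try the new module name first and fall back to the old one. The following is only a sketch under that assumption: pick_split_module is a hypothetical helper name, not a scikit-learn function, and the two module names are the ones mentioned in this thread.

```python
import importlib


def pick_split_module(candidates=("sklearn.model_selection",
                                  "sklearn.cross_validation")):
    """Return the first module in candidates that imports, else None.

    Hypothetical helper for illustration only.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Module (or its parent package) is missing; try the next name.
            continue
    return None


module = pick_split_module()
if module is None:
    print("scikit-learn does not appear to be installed")
else:
    print("using %s" % module.__name__)
    train_test_split = module.train_test_split
```

Note that this only papers over the import path. The LeaveOneOut constructor and iteration style still differ between the two modules, as described in Answer 1, so upgrading is usually the simpler route.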