Using sklearn RFE with an estimator from another p

Posted 2019-07-21 16:13

Question:

Is it possible to use sklearn's Recursive Feature Elimination (RFE) with an estimator from another package?

Specifically, I want to wrap GLM from the statsmodels package in sklearn's RFE.

If yes, could you please give some examples?

Answer 1:

Yes, it is possible. Create a class that inherits from sklearn.base.BaseEstimator, give it fit and predict methods, and make sure its fit method exposes feature importances through either a coef_ or a feature_importances_ attribute. Here is a simplified example of such a class:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.base import BaseEstimator
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import RFE

class MyEstimator(BaseEstimator):
    def __init__(self):
        # Stand-in for a model that would come from another package
        self.model = LogisticRegression()

    def fit(self, X, y, **kwargs):
        self.model.fit(X, y)
        # RFE ranks features by the magnitude of coef_
        self.coef_ = self.model.coef_
        return self  # sklearn convention: fit returns the estimator

    def predict(self, X):
        result = self.model.predict(X)
        return np.array(result)

if __name__ == '__main__':
    X, y = make_classification(n_features=10, n_redundant=0, n_informative=7, n_clusters_per_class=1)
    estimator = MyEstimator()
    # n_features_to_select is keyword-only in recent scikit-learn releases
    selector = RFE(estimator, n_features_to_select=5, step=1)
    selector = selector.fit(X, y)
    print(selector.support_)  # boolean mask of the selected features
    print(selector.ranking_)  # rank 1 marks a selected feature