I'm learning about chi2 for feature selection and came across code like this
However, my understanding of chi2 was that higher scores mean that the feature is more independent (and therefore less useful to the model) and so we would be interested in features with the lowest scores. However, using scikit learns SelectKBest, the selector returns the values with the highest chi2 scores. Is my understanding of using the chi2 test incorrect? Or does the chi2 score in sklearn produce something other than a chi2 statistic?
See code below for what I mean (mostly copied from above link except for the end)
from sklearn.datasets import load_iris
# Load iris data
iris = load_iris()
# Create features and target
X = iris.data
y = iris.target
# Convert to categorical data by converting data to integers
X = X.astype(int)
# Select two features with highest chi-squared statistics
chi2_selector = SelectKBest(chi2, k=2)
chi2_selector.fit(X, y)
# Look at scores returned from the selector for each feature
chi2_scores = pd.DataFrame(list(zip(iris.feature_names, chi2_selector.scores_, chi2_selector.pvalues_)), columns=['ftr', 'score', 'pval'])
chi2_scores
# you can see that the kbest returned from SelectKBest
#+ were the two features with the _highest_ score
kbest = np.asarray(iris.feature_names)[chi2_selector.get_support()]
kbest