Statsmodels Python - Weighted GLM

Posted 2019-09-19 21:04

Question:

I am currently working with significantly imbalanced data using the statsmodels package's GLM (or the separate logit function if need be). So far I have not found a way to implement instance weighting in these methods; however, I have heard that the current 0.7 dev release may have this functionality.

1) Is there a way to implement sample weighting in the current stable release? 2) If not, has the current 0.7-dev release implemented this feature yet?

While I know I can manually over- or undersample the data, I would prefer to weight the classes: undersampling intentionally discards training data, and oversampling may produce a sample that is not truly representative of the population.

Even if statsmodels does not have the functionality to balance the classes internally, scikit-learn's balance_weights function can be used in conjunction with instance weighting to achieve the same result (http://jaquesgrobler.github.io/online-sklearn-build/modules/generated/sklearn.preprocessing.balance_weights.html).
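To make clear what I mean by balancing weights, here is a minimal sketch of the idea in plain Python. It is not scikit-learn's implementation; the function name `balanced_sample_weights` and the inverse-class-frequency scaling are my own assumptions for illustration. Each sample is weighted so that every class ends up with the same total weight.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Hypothetical helper: one weight per sample, inversely
    proportional to the frequency of that sample's class."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # Each class receives a total weight of n_samples / n_classes,
    # shared equally among the samples belonging to it.
    return [n_samples / (n_classes * counts[y]) for y in labels]


# Toy imbalanced target: 8 negatives, 2 positives.
y = [0] * 8 + [1] * 2
w = balanced_sample_weights(y)

# Total weight per class, which should be equal after balancing.
class_totals = {c: sum(wi for wi, yi in zip(w, y) if yi == c) for c in set(y)}
```

Weights like `w` are what I would want to pass to the model as per-instance weights, if the estimator accepts them.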

Thank you