How to fix Statsmodel warning: “Maximum no. of ite

Posted 2020-07-10 10:25

Question:

I am using Anaconda and trying to run a logistic regression. After loading the training data set and fitting the model, I got the following warning message.

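import statsmodels.api as sm

# data is a pandas DataFrame loaded earlier; predictors are every column except the first
train_cols = data.columns[1:]
logit = sm.Logit(data['harmful'], data[train_cols])
result = logit.fit()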
Warning: Maximum number of iterations has been exceeded.
     Current function value: 0.000004
     Iterations: 35
C:\Users\dell\Anaconda\lib\site-packages\statsmodels\base\model.py:466: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)

Why do I get this warning and how can I fix this? Thanks!

Answer 1:

There are two possibilities:

1) A difficult optimization problem. Logit usually converges very quickly, so the default maximum number of iterations is set low (the output above stops at 35). Passing a larger maxiter keyword in the call to fit, or refitting with the previous result's parameters as start_params, helps in most cases.

2) Since this is Logit, there may be complete or quasi-complete separation. In that case some parameters can go off to infinity, and the optimization stops at some convergence or stopping criterion. Logit detects the simple case of full separation and raises an exception, but partial separation may go undetected. With perfect separation you get perfect predictability for some or all cases, which is useful for prediction but causes problems in estimating and identifying the parameters. See, for example, https://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression for more information. There are also several FAQ issues in the statsmodels GitHub issue tracker covering corner cases and problems like this.



Answer 2:

Check the levels of every variable. One of them may have almost 99% of its observations in a single category, which makes it difficult for the model to converge. I resolved it by removing that variable from my dataset.
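One way to run that check, as a rough sketch: compute the share of rows taken by the most common value in each column and flag the columns where it is very high. The helper name dominant_share, the toy columns, and the 0.95 threshold are all hypothetical choices for illustration, not part of statsmodels or of the asker's data.

```python
import pandas as pd

def dominant_share(df):
    """For each column, return the share of rows taken by its most common value."""
    # value_counts sorts by frequency in descending order, so the first entry is the largest share.
    return df.apply(lambda col: col.value_counts(normalize=True, dropna=False).iloc[0])

# Hypothetical toy data: "rare_flag" is 1 in only 1 of 100 rows.
data = pd.DataFrame({
    "balanced": [0, 1] * 50,
    "rare_flag": [1] + [0] * 99,
})

shares = dominant_share(data)

# Flag columns dominated by a single value (0.95 is an arbitrary cutoff).
suspect = shares[shares >= 0.95].index.tolist()
print(shares)
print("candidates to inspect or drop:", suspect)
```

A flagged column is not necessarily the culprit; it is just a candidate to inspect, for instance by cross-tabulating it against the outcome before deciding whether to drop it.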