I am using Anaconda and trying out logistic regression. After loading the training data set and running the regression, I got the following warning message.
import statsmodels.api as sm
train_cols = data.columns[1:]
logit = sm.Logit(data['harmful'], data[train_cols])
result = logit.fit()
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.000004
Iterations: 35
C:\Users\dell\Anaconda\lib\site-packages\statsmodels\base\model.py:466: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)
Why do I get this warning and how can I fix this?
Thanks!
There are two possibilities:
1) A difficult optimization problem: Logit usually converges very quickly, so the default maximum number of iterations is set very low. Passing a larger maxiter keyword in the call to fit, or refitting with the previous result as start_params, helps in most cases.
2) Since this is Logit, there may be complete or quasi-complete separation. In that case some parameters can go off to infinity, and the optimization stops at some convergence or stopping criterion. Logit detects the simple case of full separation and raises an exception, but there can be partial separation that is not detected. With perfect separation you get perfect predictability for some or all cases, which is useful for prediction but causes problems in estimating and identifying the parameters.
More information is available, for example, here: https://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression There are also several FAQ issues in the statsmodels GitHub issue tracker that cover corner cases and problems like this.
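As a rough first check for separation, one can test whether any single predictor splits the two outcome classes by itself. The helper separating_columns below is hypothetical, not part of statsmodels, and the toy columns dose and age are invented. It only catches separation by one variable at a time; separation by a combination of variables would need a different check.

```python
import pandas as pd

def separating_columns(X, y):
    """Return columns whose values alone split y == 0 from y == 1."""
    flagged = []
    for col in X.columns:
        x0 = X.loc[y == 0, col]
        x1 = X.loc[y == 1, col]
        if x0.empty or x1.empty:
            continue
        # Complete separation by one variable: the value ranges of the
        # two classes do not overlap.
        if x0.max() < x1.min() or x1.max() < x0.min():
            flagged.append(col)
    return flagged

# Toy data: 'dose' separates the classes perfectly, 'age' does not.
X = pd.DataFrame({"dose": [0.1, 0.4, 0.5, 1.2, 1.5, 1.9],
                  "age": [30, 52, 41, 38, 60, 29]})
y = pd.Series([0, 0, 0, 1, 1, 1], name="harmful")

print(separating_columns(X, y))  # → ['dose']
```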
Check the levels of all variables. One of them might have almost 99% of its observations in a single category, which makes it difficult for the model to converge.
I resolved it by removing that variable from my dataset.
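One way to run this check, sketched with pandas: compute the share of rows taken by the most frequent value in each column and flag the columns above a cutoff. The function name dominant_level_share, the toy columns flag_a and flag_b, and the 0.95 cutoff are all assumptions for illustration; pick a cutoff that suits your data.

```python
import pandas as pd

def dominant_level_share(df):
    """Share of rows taken by the most frequent value in each column."""
    return df.apply(
        lambda s: s.value_counts(normalize=True, dropna=False).iloc[0]
    )

# Toy data: 'flag_a' is 1 in 99 of 100 rows.
data = pd.DataFrame({
    "harmful": [0, 1] * 50,
    "flag_a": [1] * 99 + [0],
    "flag_b": [0, 1, 1, 0] * 25,
})

shares = dominant_level_share(data)
suspect = shares[shares >= 0.95].index.tolist()  # 0.95 is an arbitrary cutoff
print(suspect)  # → ['flag_a']
```

Dropping such a column is one option, as described above; whether that is appropriate depends on how much the variable matters for the analysis.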