Multinomial/conditional Logit Regression, Why Stat

Posted 2019-05-08 16:38

Question:

I am trying to reproduce an example of a multinomial logit regression of the mlogit package in R.

library("mlogit")
data("Fishing", package = "mlogit")
Fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode")
#a pure "conditional" model
summary(mlogit(mode ~ price + catch, data = Fish))

To reproduce this example with the statsmodels function MNLogit, I exported the Fishing data set as a CSV file and did the following:

import pandas
import statsmodels.api as st
#load data
df = pandas.read_csv("Fishing.csv")
x = df.drop('mode', axis = 1)
y = df['mode']
mdl = st.MNLogit(y, x)
mdl_fit = mdl.fit()

I receive the following error

LinAlgError: Singular matrix

I have tried to figure out how to reorganise the original Fishing data set, as I know that the mlogit package reorganises the data before fitting, but I can't figure out how to do the same in statsmodels. Any help would be much appreciated.

Answer 1:

MNLogit in statsmodels implements a different version of multinomial logit. As far as I can see, it corresponds to multinom from the nnet package in R: https://stats.stackexchange.com/questions/186344/r-interpreting-the-multinom-output-using-the-iris-dataset/188426

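In that version, the parameters differ across choices but the explanatory variables do not: each observation has a single set of regressors. In the multiple-choice CLogit version, which is what mlogit fits in R for this example, the explanatory variables differ across choices (each fishing mode has its own price and catch) but the parameters are the same for every choice.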
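The difference between the two versions can be sketched in a few lines of plain Python. This is only an illustration of the two probability formulas, not statsmodels or mlogit code: the function names `mnlogit_probs` and `clogit_probs` are hypothetical, and all numbers are made up rather than taken from the Fishing data.

```python
import math

def softmax(utilities):
    """Turn a list of utilities into choice probabilities."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def mnlogit_probs(x, betas):
    """MNLogit / nnet multinom style (hypothetical helper).

    x     : one regressor vector for the individual
    betas : one coefficient vector per alternative
            (the reference alternative is all zeros)
    """
    return softmax([sum(b * v for b, v in zip(beta_j, x)) for beta_j in betas])

def clogit_probs(x_by_alt, beta):
    """Conditional logit style (hypothetical helper).

    x_by_alt : one regressor vector per alternative
    beta     : a single coefficient vector shared by all alternatives
    """
    return softmax([sum(b * v for b, v in zip(beta, x_j)) for x_j in x_by_alt])

# Made-up numbers for one individual, for illustration only.
alternatives = ["beach", "pier", "boat", "charter"]

# MNLogit: x = [intercept, income]; betas[j] belongs to alternative j.
p_mnl = mnlogit_probs(
    [1.0, 5.0],
    [[0.0, 0.0], [0.5, -0.1], [0.2, 0.1], [1.0, 0.0]],
)

# Conditional logit: x_by_alt[j] = [price_j, catch_j]; beta = [b_price, b_catch].
p_cl = clogit_probs(
    [[100.0, 0.10], [100.0, 0.05], [120.0, 0.30], [150.0, 0.50]],
    [-0.02, 1.0],
)

for name, a, b in zip(alternatives, p_mnl, p_cl):
    print(f"{name:8s} MNLogit={a:.3f}  CLogit={b:.3f}")
```

In the first function the coefficient list has one row per alternative, while in the second the data has one row per alternative. That is why passing the wide Fishing table straight to MNLogit does not reproduce the mlogit result.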

CLogit and the other multinomial logit versions are still waiting in pull requests for statsmodels and are currently not available in the main branch.
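On the reshaping part of the question: a conditional logit needs the data in "long" form, with one row per individual and alternative, which is roughly what mlogit.data does with shape = "wide". Below is a minimal pure-Python sketch of that reshape. It assumes the exported CSV has columns named like price.beach and catch.beach; the helper name `wide_to_long` and the sample row are hypothetical.

```python
def wide_to_long(rows, alternatives, varying, choice_col="mode"):
    """Expand each wide row into one row per alternative (hypothetical helper).

    rows         : list of dicts, one per individual (e.g. from csv.DictReader)
    alternatives : alternative names, e.g. ["beach", "pier", "boat", "charter"]
    varying      : stems of alternative-specific columns, e.g. ["price", "catch"]
    choice_col   : column holding the chosen alternative
    """
    long_rows = []
    for i, row in enumerate(rows):
        for alt in alternatives:
            rec = {"id": i, "alt": alt, "chosen": row[choice_col] == alt}
            for var in varying:
                # Assumed naming scheme: "<variable>.<alternative>"
                rec[var] = row[f"{var}.{alt}"]
            long_rows.append(rec)
    return long_rows

# One made-up individual in wide format, for illustration only.
wide = [{
    "mode": "boat",
    "price.beach": 100.0, "price.pier": 100.0,
    "price.boat": 120.0, "price.charter": 150.0,
    "catch.beach": 0.10, "catch.pier": 0.05,
    "catch.boat": 0.30, "catch.charter": 0.50,
}]

long_rows = wide_to_long(
    wide, ["beach", "pier", "boat", "charter"], ["price", "catch"]
)
for rec in long_rows:
    print(rec)
```

Each individual now contributes four rows with a boolean `chosen` flag, which is the layout a conditional logit estimator expects.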