Pandas with Fixed Effects

Posted 2020-07-25 23:06

Question:

I'm using Pandas on Python 2.7. I have data with the following columns: State, Year, UnempRate, Wage

I'm teaching a course on how to use Python for research. As the culmination of our project, I want to run a regression of UnempRate on Wage controlling for State and Year fixed effects.

I can do this by creating dummies for state and year and then running:

ols(y=df['UnempRate'],x=df[FullDummyList])

Is there an easier way to do this? I was trying to use the PanelOLS method mentioned here: Fixed effect in Pandas or Statsmodels

But I can't seem to get the syntax right, or find more documentation on it.

Thanks!

Answer 1:

The simplest way to create the dummy variables for the fixed effects is to use patsy, either directly or through the formula interface to the models in statsmodels.
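As a rough sketch of the formula approach, the snippet below fits the regression with state and year dummies generated by patsy's C() operator. The data frame is synthetic and built only for illustration; the column names follow the question, and everything else (state labels, coefficients, noise) is an assumption.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic balanced panel, for illustration only (not real data).
rng = np.random.RandomState(0)
states = ["S%02d" % i for i in range(10)]
years = list(range(2000, 2006))
df = pd.DataFrame(
    [(s, y) for s in states for y in years], columns=["State", "Year"]
)
df["Wage"] = rng.normal(20.0, 2.0, size=len(df))
df["UnempRate"] = 8.0 - 0.1 * df["Wage"] + rng.normal(0.0, 0.5, size=len(df))

# C(...) marks a column as categorical; patsy expands it into dummy
# variables and drops one reference level per factor, so the intercept
# does not make the design matrix collinear.
fe_result = smf.ols("UnempRate ~ Wage + C(State) + C(Year)", data=df).fit()

print(fe_result.params["Wage"])
```

This should give the same point estimates as building the dummy columns by hand and passing them to OLS; the formula just saves the bookkeeping.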

Statsmodels.OLS, as well as GLM and the discrete models, also have an option to calculate cluster or panel robust (sandwich) covariance matrices for the parameter estimates. Since release 0.6 this can be specified by a cov_type option in the fit method.
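A minimal sketch of the cov_type option, assuming you want standard errors clustered by state. The data are synthetic and purely illustrative, and the integer group codes come from pd.factorize, which is one convenient choice rather than a requirement stated above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic balanced panel, for illustration only (not real data).
rng = np.random.RandomState(0)
states = ["S%02d" % i for i in range(10)]
years = list(range(2000, 2006))
df = pd.DataFrame(
    [(s, y) for s in states for y in years], columns=["State", "Year"]
)
df["Wage"] = rng.normal(20.0, 2.0, size=len(df))
df["UnempRate"] = 8.0 - 0.1 * df["Wage"] + rng.normal(0.0, 0.5, size=len(df))

model = smf.ols("UnempRate ~ Wage + C(State) + C(Year)", data=df)

# Default (nonrobust) covariance, kept for comparison.
plain_result = model.fit()

# Integer cluster labels, one code per state.
state_codes = pd.factorize(df["State"])[0]

# Same point estimates, but a cluster-robust (sandwich) covariance.
cluster_result = model.fit(cov_type="cluster", cov_kwds={"groups": state_codes})

print(plain_result.bse["Wage"], cluster_result.bse["Wage"])
```

Only the covariance matrix changes between the two fits; the coefficient estimates are identical.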

statsmodels currently has no panel models that take correlation across observations into account; however, GEE allows one-way cluster correlation in static panel or longitudinal models.

I don't know the details about the panel estimation in pandas, but it's not maintained and will eventually be moved to or replaced by statsmodels.