I'm working on a state and year fixed effects regression, which has 3 observations per state/year combo based on the race for that row (white, black, other) - See link below.
So far, I've been using the base lm function to estimate a fixed effects regression that accounts for all three races. I do this by using state, year and race all as factor variables. I am also running separate regressions for each individual race. The problem is that I would prefer to use the plm package so that i can get the within r-squared for the model with all races, however it is giving me errors.
Edit: I included a picture of my data here the data is a balanced panel, there are 34 states, 12 years (2003-2014) and 3 races for each state/year combo so a total of 1244 observations.
Here is the code I'm using to run the plm regression:
#plm regression
plm.reg <- plm(drugcrime_ar ~ decrim_dummy + median_income + factor(race),
data = my.data, index=c("st_name","year"), model = "within",
effect = "twoways")
The errors I get in return:
Error in pdim.default(index[[1]], index[[2]]):
duplicate couples (id-time)
In addition: Warning messages:
1: In pdata.frame(data, index) :
duplicate couples (id-time) in resulting pdata.frame
to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany"
2: In is.pbalanced.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
3: In is.pbalanced.default(index[[1]], index[[2]]) :
duplicate couples (id-time) `
Is there a workaround for this or am I out of luck?
The
plm
function needs just one pair of id/time. For each id you supplied you have more than one year.If each
st_name
andrace
pairs form an "individual" (or whatever the name you give to this dimension of the panel), then you could do:See, however, that in this situation you are not using a kind of nested panel structure as @Helix123 suggested. You are only redefining the first dimension of the panel.