Fixed effects with the plm package in R - multiple observations per state/year

Posted 2019-07-14 03:11

I'm working on a state and year fixed effects regression, which has 3 observations per state/year combo based on the race for that row (white, black, other) - See link below.
So far, I've been using the base lm function to estimate a fixed effects regression that accounts for all three races, with state, year and race all entered as factor variables. I am also running separate regressions for each individual race. The problem is that I would prefer to use the plm package so that I can get the within R-squared for the model with all races, but it is giving me errors.

Edit: I included a picture of my data here. The data is a balanced panel: there are 34 states, 12 years (2003-2014) and 3 races for each state/year combo, so a total of 1224 observations (34 x 12 x 3).

Here is the code I'm using to run the plm regression:

#plm regression
plm.reg <- plm(drugcrime_ar ~ decrim_dummy + median_income + factor(race),
               data = my.data, index=c("st_name","year"), model = "within",
               effect = "twoways")

The errors I get in return:

Error in pdim.default(index[[1]], index[[2]]): 
   duplicate couples (id-time) 
In addition: Warning messages: 
1: In pdata.frame(data, index) :
   duplicate couples (id-time) in resulting pdata.frame
   to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany")
2: In is.pbalanced.default(index[[1]], index[[2]]) :
   duplicate couples (id-time)
3: In is.pbalanced.default(index[[1]], index[[2]]) :
   duplicate couples (id-time)

Is there a workaround for this or am I out of luck?

Tags: r plm economics
1 Answer
时光不老,我们不散
Reply #2 · 2019-07-14 04:01

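The plm function requires each id/time pair to appear only once. With index = c("st_name", "year"), every state/year pair appears three times in your data (once per race), which is what the "duplicate couples (id-time)" error is reporting.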
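To see the duplicates directly, here is a minimal base-R sketch. It uses a made-up toy data frame with the same layout as the question (the state names and the object names toy, pair_counts, n_dup_state_year and n_dup_state_race_year are placeholders, not part of the original data):

```r
# Toy data with the same shape as the question: 2 states x 2 years x 3 races.
# State names are made up for illustration only.
toy <- expand.grid(
  st_name = c("Alabama", "Alaska"),
  year    = 2003:2004,
  race    = c("white", "black", "other"),
  stringsAsFactors = FALSE
)

# Count rows per (st_name, year) pair; any count above 1 is a duplicate couple.
pair_counts <- table(toy$st_name, toy$year)
print(pair_counts)

# Number of repeated rows under each candidate index.
n_dup_state_year      <- sum(duplicated(toy[, c("st_name", "year")]))
n_dup_state_race_year <- sum(duplicated(toy[, c("st_name", "race", "year")]))
```

On the real data, the same duplicated() call on my.data[, c("st_name", "year")] should flag the extra race rows, while adding race to the key should leave no duplicates if the panel is laid out as described.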

If each st_name/race pair forms an "individual" (or whatever name you give to this dimension of the panel), then you could do:

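library(plm)
library(dplyr)

# Build one id per state/race combination so that each (id, year) pair is unique.
my.data$id <- group_indices(my.data, st_name, race)
# Passing grouping columns to group_indices() is deprecated in dplyr >= 1.0.0; the equivalent there is:
# my.data <- my.data %>% group_by(st_name, race) %>% mutate(id = cur_group_id()) %>% ungroup()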

plm.reg <- plm(drugcrime_ar ~ decrim_dummy + median_income + factor(race),
           data = my.data, index=c("id","year"), model = "within",
           effect = "twoways")

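Note, however, that this does not give you the kind of nested panel structure that @Helix123 suggested. You are only redefining the first (individual) dimension of the panel. Also, because race is constant within each id defined this way, the factor(race) term is collinear with the individual effects and cannot be estimated separately in this within model.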
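As a rough end-to-end check of this approach, the sketch below runs on simulated data, not the asker's. All values, the state labels S1 to S4, and the object names toy, toy.reg and within_r2 are made up. It builds the combined id with base R's interaction() as an alternative to group_indices(), and it leaves out factor(race) because race does not vary within id:

```r
library(plm)

set.seed(1)
# Simulated panel: 4 states x 3 races x 5 years (all values are made up).
toy <- expand.grid(st_name = paste0("S", 1:4),
                   race    = c("white", "black", "other"),
                   year    = 2003:2007,
                   stringsAsFactors = FALSE)
toy$median_income <- rnorm(nrow(toy), mean = 50, sd = 10)
# Hypothetical policy: states S1 and S2 decriminalize from 2005 onwards.
toy$decrim_dummy  <- as.integer(toy$st_name %in% c("S1", "S2") & toy$year >= 2005)
toy$drugcrime_ar  <- 2 * toy$decrim_dummy + 0.1 * toy$median_income + rnorm(nrow(toy))

# One id per state/race combination (base-R alternative to group_indices()).
toy$id <- as.integer(interaction(toy$st_name, toy$race, drop = TRUE))

# Each (id, year) pair is now unique, so plm accepts the index.
stopifnot(!any(duplicated(toy[, c("id", "year")])))

toy.reg <- plm(drugcrime_ar ~ decrim_dummy + median_income,
               data = toy, index = c("id", "year"),
               model = "within", effect = "twoways")

# R-squared of the within (fixed effects) model.
within_r2 <- r.squared(toy.reg)
print(within_r2)
```

The same r.squared() call on plm.reg should give the within R-squared the question asks for, and summary(plm.reg) reports it as well.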
