Model Prediction for pooled regression model in pa

I'm trying to build a predictive model in which I run a separate pooled regression for each year (based on the preceding years), so that the coefficients can vary over time. (This may not make sense for the sample data provided, but it is done in practice for my sample.)

Here is what I have come up with so far. I adapted my code to a reproducible example using the Grunfeld data from the plm package:

The data are structured as a panel, indexed by firm and year.

> head(Grunfeld)
  firm year   inv  value capital
1    1 1935 317.6 3078.5     2.8
2    1 1936 391.8 4661.7    52.6
3    1 1937 410.6 5387.1   156.9
4    1 1938 257.7 2792.2   209.2
5    1 1939 330.8 4313.2   203.4
6    1 1940 461.2 4643.9   207.2

and here is my code:

library(plm)
data("Grunfeld", package="plm")

# Store each subset regression in myregression
myregression <- list()
count <- 1

## pooled regression in each year t, 
## with subset data of the previous six years (t-5) 

for(t in 1940:1950){  
  myregression[[count]] <- plm(inv ~ value + capital, 
                              subset(Grunfeld, year<=t & year>=t-5),
                              index=c("firm","year"))
  # Name each regression based on the year range included in the data subset
  names(myregression)[[count]] = paste0("Year_",t)
  count <- count+1
}


## Prediction
#######################
# Alternative 1: Loop

Forecast<-list()
count<-1
for(t in 1940:1950){
  Forecast[[count]]<-predict(myregression[[count]], subset(Grunfeld, year==t))
  ## Name each Prediction based on the year t:
  names(Forecast)[[count]] = paste0("Year_",t)
  count <- count+1
}

Unfortunately, my code does not work and I get the following error:

Error in crossprod(beta, t(X)) : non-conformable arguments

Ideally, I would like to store my predictions/forecasts in Grunfeld$Forecast, in the same structure as the original Grunfeld data. However, I have had a lot of difficulty working with lists: I often fail to index them correctly and to store the results in a vector next to the original data. This is crucial because my own sample has a lot of missing data (NAs), so I can only use the predict function on a limited subset. How can I arrange the data in the desired way?

And is this the right approach to obtain forecasts conditional on the year, with varying slopes, and to store them in the same manner as the original data, or are there more efficient ways that I'm unaware of?

Tags: r linear-regression prediction data-manipulation panel-data

1 Answer

别忘想泡老子

#2 · 2019-08-09 01:59

Note that you are not estimating a pooled regression. By default, plm estimates a within (fixed-effects) model. A quick summary of the first regression reveals this. See e.g. summary(myregression[[1]]), whose first lines read:

Oneway (individual) effect Within Model

Call:
plm(formula = inv ~ value + capital, data = subset(Grunfeld, 
    year <= t & year >= t - 5), index = c("firm", "year"))

...

Since you talk about a pooled regression, try the following code. I took the liberty to make it a bit shorter:

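# Start from empty lists so that entries are keyed by year only
myregression <- list()
Forecast <- list()

# Pooled OLS on the window t-5..t; model="pooling" overrides the within default
for(t in 1940:1950){  
  myregression[[as.character(t)]] <- plm(inv ~ value + capital, 
                                         subset(Grunfeld, year<=t & year>=t-5),
                                         index=c("firm","year") , model="pooling")
}
# Predict year t from the regression stored under the same year
for(t in 1940:1950){
  Forecast[[as.character(t)]]<-predict(myregression[[as.character(t)]], 
                                       subset(Grunfeld, year==t))
}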

This gives you your predicted values without error messages.
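The difference between the two estimators shows up in the coefficient vectors. The following is a minimal sketch, assuming the plm package is installed; the names window, fit_within, fit_pooled and fit_lm are made up for illustration. The within model reports no common intercept, whereas the pooling model does, which is a plausible reason for the non-conformable error in the question. As a cross-check, pooled OLS should give the same coefficients as base lm() on the stacked data.

```r
library(plm)
data("Grunfeld", package = "plm")

# First estimation window from the question: 1935 to 1940
window <- subset(Grunfeld, year >= 1935 & year <= 1940)

# Default model = "within": firm effects are swept out, no common intercept
fit_within <- plm(inv ~ value + capital, data = window,
                  index = c("firm", "year"))

# Pooled OLS: a single intercept and common slopes
fit_pooled <- plm(inv ~ value + capital, data = window,
                  index = c("firm", "year"), model = "pooling")

# Compare which coefficients each fit reports
names(coef(fit_within))
names(coef(fit_pooled))

# Cross-check against base R: pooled OLS is ordinary least squares on stacked rows
fit_lm <- lm(inv ~ value + capital, data = window)
all.equal(unname(coef(fit_pooled)), unname(coef(fit_lm)))
```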

I can't comment on your last question about whether or not this is the right statistical approach, but I hope that the R-related question is settled.

To respond to your comment, try

Grunfeld$forc <- NA

for(t in 1940:1950){
  # year is numeric, so compare it with t directly
  Grunfeld[which(Grunfeld$year==t), "forc"] <-
               predict(myregression[[as.character(t)]], subset(Grunfeld, year==t))
}
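Since the question mentions missing values, here is a hedged alternative sketch that fits the same pooled specification with base lm() instead of plm. It takes only the Grunfeld data from plm, and the column name forc_lm is made up for illustration. predict() for lm objects returns one value per row of newdata and yields NA for rows with missing predictors, so the result always lines up with the rows being filled.

```r
# Only the data set comes from plm; the models are fitted with base lm()
data("Grunfeld", package = "plm")

# Hypothetical column name; rows outside 1940-1950 stay NA
Grunfeld$forc_lm <- NA_real_

for (t in 1940:1950) {
  # Estimation window: years t-5 through t, as in the question
  window <- subset(Grunfeld, year >= t - 5 & year <= t)
  fit <- lm(inv ~ value + capital, data = window)

  # Rows of the original data frame that belong to year t
  rows <- which(Grunfeld$year == t)

  # One prediction per row of newdata, so the lengths always match
  Grunfeld$forc_lm[rows] <- predict(fit, newdata = Grunfeld[rows, ])
}

head(subset(Grunfeld, year >= 1940))
```

Note that, as in the question's code, the window includes year t itself, so these are in-sample fitted values for year t. For an out-of-sample forecast, the window would presumably have to end at t-1.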
