wls_prediction_std returns standard deviation and confidence interval of my fitted model data. I would need to know the way the confidence intervals are calculated from the covariance matrix. (I already tried to figure it out by looking at the source code but wasn't able to) I was hoping some of you guys could help me out by writing out the mathematical expression behind wls_prediction_std.
问题:
回答1:
There should be a variation on this in any textbook, without the weights.
For OLS, Greene (5th edition, which I used) has
se = s^2 (1 + x (X'X)^{-1} x')
where s^2 is the estimate of the residual variance, x
is vector or explanatory variables for which we want to predict and X
are the explanatory variables used in the estimation.
This is the standard error for an observation, the second part alone is the standard error for the predicted mean y_predicted = x beta_estimated
.
wls_prediction_std
uses the variance of the parameter estimate directly.
Assuming x is fixed, then y_predicted is just a linear transformation of the random variable beta_estimated
, so the variance of y_predicted
is just
x Cov(beta_estimated) x'
To this we still need to add the estimate of the error variance.
As far as I remember, there are estimates that have better small sample properties.
I added the weights, but never managed to verify them, so the function has remained in the sandbox for years. (Stata doesn't return prediction errors with weights.)
Aside:
Using the covariance of the parameter estimate should also be correct if we use a sandwich robust covariance estimator, while Greene's formula above is only correct if we don't have any misspecified heteroscedasticity.
What wls_prediction_std
doesn't take into account is that, if we have a model for the heteroscedasticity, then the error variance could also depend on the explanatory variables, i.e. on x.