stan number of effective sample size

2019-08-10 20:09发布

问题:

I reproduced the results of a hierarchical model using the rethinking package with just rstan() and I am just curious why n_eff is not closer.

Here is the model with random intercepts for 2 groups (intercept_x2) using the rethinking package:

Code:

response = c(rnorm(500,0,1),rnorm(500,200,10))
predicotr1_continuous = rnorm(1000)
predictor2_categorical = factor(c(rep("A",500),rep("B",500) ))
data = data.frame(y = response, x1 = predicotr1_continuous, x2 = predictor2_categorical)
head(data)
library(rethinking)
m22 <- map2stan( 
  alist(
    y ~ dnorm( mu , sigma ) ,
    mu  <- intercept +  intercept_x2[x2] + beta*x1  ,
    intercept ~ dnorm(0,10),
    intercept_x2[x2] ~ dnorm(0, sigma_2),
    beta ~ dnorm(0,10),
    sigma ~ dnorm(0, 10),
    sigma_2 ~ dnorm(0,10)
  ) ,
  data=data , chains=1 , iter=5000 , warmup=500 )

precis(m22, depth = 2)

                  Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
intercept         9.96   9.59      -5.14      25.84  1368    1
intercept_x2[1]  -9.94   9.59     -25.55       5.43  1371    1
intercept_x2[2] 189.68   9.59     173.28     204.26  1368    1
beta              0.06   0.22      -0.27       0.42  3458    1
sigma             6.94   0.16       6.70       7.20  2927    1
sigma_2          43.16   5.01      35.33      51.19  2757    1

Now here is the same model in rstan():

  # create a numeric vector to indicate the categorical groups
data$GROUP_ID  = match(  data$x2, levels( data$x2   )  ) 
library(rstan)
standat <- list(
  N = nrow(data),
  y = data$y,
  x1 = data$x1,
  GROUP_ID = data$GROUP_ID,
  nGROUPS = 2
)


stanmodelcode = '
data {
int<lower=1> N; 
int nGROUPS;
real y[N];                                 
real x1[N];   
int<lower=1, upper=nGROUPS> GROUP_ID[N];
}


transformed data{

}

parameters {
real intercept;  
vector[nGROUPS] intercept_x2;
real beta;                   
real<lower=0> sigma;    
real<lower=0>  sigma_2;
}

transformed parameters {  // none needed
}

model {
real mu;

// priors
intercept~ normal(0,10);
intercept_x2 ~ normal(0,sigma_2);
beta ~ normal(0,10);
sigma ~ normal(0,10);
sigma_2 ~ normal(0,10);

// likelihood

for(i in 1:N){
mu = intercept + intercept_x2[ GROUP_ID[i] ] +  beta*x1[i];
y[i] ~ normal(mu, sigma);
}

}
'

fit22 = stan(model_code=stanmodelcode, data=standat, iter=5000, warmup=500, chains = 1)
fit22


    Inference for Stan model: b212ebc67c08c77926c59693aa719288.
    1 chains, each with iter=5000; warmup=500; thin=1; 
    post-warmup draws per chain=4500, total post-warmup draws=4500.

                        mean se_mean   sd     2.5%      25%      50%      75%    97.5% n_eff Rhat
    intercept          10.14    0.30 9.72    -8.42     3.56    10.21    16.71    29.19  1060    1
    intercept_x2[1]   -10.12    0.30 9.73   -29.09   -16.70   -10.25    -3.50     8.36  1059    1
    intercept_x2[2]   189.50    0.30 9.72   170.40   182.98   189.42   196.09   208.05  1063    1
    beta                0.05    0.00 0.21    -0.37    -0.10     0.05     0.20     0.47  3114    1
    sigma               6.94    0.00 0.15     6.65     6.84     6.94     7.05     7.25  3432    1
    sigma_2            43.14    0.09 4.88    34.38    39.71    42.84    46.36    53.26  3248    1
    lp__            -2459.75    0.05 1.71 -2463.99 -2460.68 -2459.45 -2458.49 -2457.40  1334    1

    Samples were drawn using NUTS(diag_e) at Thu Aug 31 15:53:09 2017.
    For each parameter, n_eff is a crude measure of effective sample size,
    and Rhat is the potential scale reduction factor on split chains (at 
    convergence, Rhat=1).

My Questions:

  • the n_eff is larger using rethinking(). There is simulation differences but do you think something else is going on here?
  • Besides the n_eff being different the percentiles of the posterior distributions are different. I was thinking rethinking() and rstan() should return similar results with 5000 iterations since rethinking is just calling rstan. Are differences like that normal or something different between the 2 implementations?
  • I created data$GROUP_ID to indicate the categorical groupings. Is this the correct way to incorporate categorical variables into a hierarchical model in rstan()? I have 2 groups and if I had 50 groups I use the same data$GROUP_ID vector but is that the standard way?

Thank you.