I have an integral to evaluate
"x^(-0.5)" ; x in [0.01,1]
for which I am using Importance Sampling MC : The theory says that an approximate PDF has to be used to compute the expected value (which will almost surely converge to the mean - value of the integral)
After plotting the given integral, and exponential PDF, based only on the plots, I chose the
rexp
and dexp
to generate the PDF - and my code looks like this -
#Without Importance Sampling
set.seed(1909)
X <- runif(1000,0.01,1)
Y <- X^(-0.5)
c( mean(Y), var(Y) )
#Importance sampling Monte Carlo
w <- function(x) dunif(x, 0.01, 1)/dexp(x,rate=1.5)
f <- function(x) x^(-0.5)
X= rexp(1000,rate=1.5)
Y=w(X)*f(X)
c( mean(Y), var(Y) )
Could someone please confirm if my line of thought is correct? If wrong, how differently am I supposed to approach this? Please elucidate - I have understood the theory but implementation is proving to be problematic for me.
For integrals that are not so simple,
1.) f(x) = [1+sinh(2x)ln(x)]^-1 I chose the normal PDF = g(x) (with mean = 0.5 and SD = 5) as approximate only after observing the plot. I wrote a code similar to the one for it , but it says NaN's produced in case of importance sampling. (this ideally means undefined function but I don't know how to solve this).
2.) f(x,y) = exp(-x^4 - y^4)
How do I choose the g(x,y) for the above function ?
Generally your approach seems to be correct, but you have to be more careful with the domain over which you want to integrate. In your original example, about 20% of values
rexp(1000, 1.5)
are above 1. The functiondexp(x, rate=1.5)
is not a density function on the interval [0,1]. You have to divide bypexp(1, rate=1.5)
. So here is what I would do for the importance sampling example:In your second example the same thing causes the problem. You get negative X and therefore get NA values for log(X). Furthermore, your normal function should be centered at 0.5 with less variance. Here's my approach:
In your second example, I don't really understand what
y
is. Is it just a constant? Then perhaps a Weibull distribution may work.EDIT: Regarding your additional questions in the comments. (1) Any probability density function should integrate to 1. Therefore
dexp(x, rate=1.5)
is not a density function on the interval [0,1], it only integrates topexp(1, rate=1.5)
. However, the functionactually integrates to 1:
That's the rationale of including the probability distribution function. If you have a different interval, e.g. [0.3,8], you have to adjust the function accordingly:
(2) Here I choose the variance so that approximately 95% of the values in
rnorm(1000, .5, .25)
were in the interval from 0 to 1 (having many values outside this interval would certainly increase the variance). However, I am not certain that this is the best choice of distribution function. The selection of the importance function is a problem that I am not very familiar with. You could ask on CrossValidated. Same goes for your next question.