This question already has an answer here:
-
Fast arbitrary distribution random sampling
4 answers
I have a piecewise quartic distribution with a probability density function:
p(x)= c(x/a)^2 if 0≤x<a;
c((b+a-x)^2/b)^2 if a≤x≤b;
0 otherwise
Suppose c, a, b are known, I am trying to draw 100 random samples from the distribution. How can I do it with numpy/scipy?
One standard way is to find an explicit formula, G = F^-1
for the inverse of the cumulative distribution function. That is doable here (although it will naturally be piecewise defined) and then use G(U)
where U
is uniform on [0,1] to generate your samples.
In this case, I think that I worked out the details, but you will need to check the Calculus/Algebra.
First of all, to streamline things it helps to introduce a couple of new parameters. Let
f(a,b,c,d,x) = c*x**2 #if 0 <= x <= a
and
f(a,b,c,d,x) = d*(x-e)**4 #if a < x <= b
Then your p(x)
is given by
p(x) = f(a,b,c/a**2,c/b**2,a+b)
I integrated f
to find the cumulative distribution and then inverted and got the following:
def Finverse(a,b,c,d,e,x):
if x <= (c*a**3)/3:
return (3*x/c)**(1/3)
else:
return e + ((a-e)**5 - (5*c*a**3)/(3*d))**(1/5)
Assuming this is right, then simply:
def randX(a,b,c):
u = random.random()
return Finverse(a,b,c/a**2,c/b**2,a+b,u)
In this case it was possible to work out an explicit formula. When you can't work out such a formula for the inverse, consider using the Monte Carlo methods described by @lucianopaz
As your function is bounded both in x
and p(x)
, I recommend that you use Monte Carlo rejection sampling. The basic principle is that you draw two uniform random numbers, one representing a candidate x
in the x
space bounds [0,b]
and another representing y
. If y
is lower or equal to the normalized p(x)
, then the sampled x
is returned, if not it continues to the next iteration
import numpy as np
def rejection_sampler(p,xbounds,pmax):
while True:
x = np.random.rand(1)*(xbounds[1]-xbounds[0])+xbounds[0]
y = np.random.rand(1)*pmax
if y<=p(x):
return x
Here, p
should be a callable to your normalized piecewise probability density, xbounds
can be a list or tuple containing the lower and upper bounds, and pmax
the maximum of the probability density in the x interval.