random unit vector in multi-dimensional space

I'm working on a data mining algorithm where i want to pick a random direction from a particular point in the feature space.

If I pick a random number for each of the n dimensions from [-1,1] and then normalize the vector to a length of 1 will I get an even distribution across all possible directions?

I'm speaking only theoretically here since computer generated random numbers are not actually random.

标签： random distribution data-mining computational-geometry uniform

5条回答

smile是对你的礼貌

2楼-- · 2019-02-08 07:50

I had the exact same question when also developing a ML algorithm.
I got to the same conclusion as Jim Lewis after drawing samples for the 2-d case and plotting the resulting distribution of the angle.

Furthermore, if you try to derive the density distribution for the direction in 2d when you draw at random from [-1,1] for the x- and y-axis ,you will see that:

f_X(x) = 1/(4*cos²(x)) if 0 < x < 45⁰
and
f_X(x) = 1/(4*sin²(x)) if x > 45⁰

where x is the angle, and f_X is the probability density distribution.

I have written about this here: https://aerodatablog.wordpress.com/2018/01/14/random-hyperplanes/

0人赞添加讨论(0) 举报

手持菜刀，她持情操

3楼-- · 2019-02-08 07:52

One simple trick is to select each dimension from a gaussian distribution, then normalize:

from random import gauss

def make_rand_vector(dims):
    vec = [gauss(0, 1) for i in range(dims)]
    mag = sum(x**2 for x in vec) ** .5
    return [x/mag for x in vec]

For example, if you want a 7-dimensional random vector, select 7 random values (from a Gaussian distribution with mean 0 and standard deviation 1). Then, compute the magnitude of the resulting vector using the Pythagorean formula (square each value, add the squares, and take the square root of the result). Finally, divide each value by the magnitude to obtain a normalized random vector.

If your number of dimensions is large then this has the strong benefit of always working immediately, while generating random vectors until you find one which happens to have magnitude less than one will cause your computer to simply hang at more than a dozen dimensions or so, because the probability of any of them qualifying becomes vanishingly small.

0人赞添加讨论(0) 举报

做个烂人

4楼-- · 2019-02-08 07:52

There is a boost implementation of the algorithm that samples from normal distributions: random::uniform_on_sphere

0人赞添加讨论(0) 举报

▲ chillily

5楼-- · 2019-02-08 07:56

You will not get a uniformly distributed ensemble of angles with the algorithm you described. The angles will be biased toward the corners of your n-dimensional hypercube.

This can be fixed by eliminating any points with distance greater than 1 from the origin. Then you're dealing with a spherical rather than a cubical (n-dimensional) volume, and your set of angles should then be uniformly distributed over the sample space.

Pseudocode:

Let n be the number of dimensions, K the desired number of vectors:

vec_count=0
while vec_count < K
   generate n uniformly distributed values a[0..n-1] over [-1, 1]
   r_squared = sum over i=0,n-1 of a[i]^2
   if 0 < r_squared <= 1.0
      b[i] = a[i]/sqrt(r_squared)  ; normalize to length of 1
      add vector b[0..n-1] to output list
      vec_count = vec_count + 1
   else
      reject this sample
end while

0人赞添加讨论(0) 举报

别忘想泡老子

6楼-- · 2019-02-08 08:07

#define SCL1 (M_SQRT2/2)
#define SCL2 (M_SQRT2*2)

// unitrand in [-1,1].
double u = SCL1 * unitrand();
double v = SCL1 * unitrand();
double w = SCL2 * sqrt(1.0 - u*u - v*v);

double x = w * u;
double y = w * v;
double z = 1.0 - 2.0 * (u*u + v*v);

0人赞添加讨论(0) 举报

random unit vector in multi-dimensional space

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间