Generating multidimensional data

Posted 2020-02-08 15:58

Does R have a package for generating random numbers in multi-dimensional space? For example, suppose I want to generate 1000 points inside a cuboid or a sphere.

5 answers
Anthone
#2 · 2020-02-08 16:25

A couple of years ago, I made a package called geozoo. It is available on CRAN.

install.packages("geozoo")
library(geozoo)

It has many different functions to produce objects in N dimensions.

p = 4
n = 1000

# Cube with points on its faces.
# A 3D version would be a box with solid walls and a hollow interior.
cube.face(p)

# Hollow sphere
sphere.hollow(p, n)


# Solid cube
cube.solid.random(p, n)
cube.solid.grid(p, 10) # evenly spaced points

# Solid Sphere
sphere.solid.random(p, n)
sphere.solid.grid(p, 10) # evenly spaced points

One of my favorite ones to watch animate is a cube with points along its edges, because it was one of the first objects that I made. It also gives you a sense of distance between vertices.

# Cube with points along its edges.
cube.dotline(4)

Also, check out the website: http://streaming.stat.iastate.edu/~dicook/geometric-data/. It contains pictures and downloadable data sets.

Hope it meets your needs!

够拽才男人
#3 · 2020-02-08 16:33

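I have some functions for hypercube and n-sphere sampling that generate data frames of Cartesian coordinates for an arbitrary number of dimensions. The hypercube sampler is uniform throughout the hypercube. The n-sphere sampler is uniform in 2 dimensions; in higher dimensions its uniform angles and square-root radius do not give a uniform distribution over the ball.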

GenerateCubiclePoints <- function(nrPoints,nrDim,center=rep(0,nrDim),l=1){

    x <-  matrix(runif(nrPoints*nrDim,-1,1),ncol=nrDim)
    x <-  as.data.frame(
            t(apply(x*(l/2),1,'+',center))
          )
    names(x) <- make.names(seq_len(nrDim))
    x
}

This generates points in a cube/hypercube of nrDim dimensions centered at center, where l is the length of one side.

For an n-sphere with nrDim dimensions, you can do something similar, where r is the radius:

GenerateSpherePoints <- function(nrPoints,nrDim,center=rep(0,nrDim),r=1){
    #generate the polar coordinates!
    x <-  matrix(runif(nrPoints*nrDim,-pi,pi),ncol=nrDim)
    x[,nrDim] <- x[,nrDim]/2
    #recalculate them to cartesians
    sin.x <- sin(x)
    cos.x <- cos(x)
    cos.x[,nrDim] <- 1  # see the formula for n.spheres

    y <- sapply(1:nrDim, function(i){
        if(i==1){
          cos.x[,1]
        } else {
          cos.x[,i]*apply(sin.x[,1:(i-1),drop=FALSE],1,prod)
        }
    })*sqrt(runif(nrPoints,0,r^2))

    y <-  as.data.frame(
            t(apply(y,1,'+',center))
          )

    names(y) <- make.names(seq_len(nrDim))
    y
}

In 2 dimensions, these give:

[Figure: scatter plots of T1 (left) and T2 (right)]

These come from the following code:

 T1 <- GenerateCubiclePoints(10000,2,c(4,3),5)
 T2 <- GenerateSpherePoints(10000,2,c(-5,3),2)
 op <- par(mfrow=c(1,2))
 plot(T1)
 plot(T2)
 par(op)
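
For more than 2 dimensions, one commonly used way to get points that are uniform throughout an n-ball is to normalize Gaussian vectors (which gives a uniform direction) and scale them by U^(1/nrDim) (which gives a uniform volume). The sketch below is a swapped-in alternative, not the functions above; the name GenerateBallPoints is hypothetical, and the arguments simply mirror the ones used earlier.

```r
# Hypothetical alternative sampler (not the author's function):
# uniform points inside an n-ball of radius r around `center`.
GenerateBallPoints <- function(nrPoints, nrDim, center = rep(0, nrDim), r = 1) {
  # Direction: Gaussian vectors divided by their length are uniform on the unit sphere
  z <- matrix(rnorm(nrPoints * nrDim), ncol = nrDim)
  z <- z / sqrt(rowSums(z^2))
  # Radius: U^(1/nrDim) makes the points uniform in volume
  rad <- r * runif(nrPoints)^(1 / nrDim)
  # Scale each row by its radius, then shift every column by the center
  y <- as.data.frame(sweep(z * rad, 2, center, "+"))
  names(y) <- make.names(seq_len(nrDim))
  y
}

set.seed(1)
T3 <- GenerateBallPoints(1000, 4, center = c(1, 2, 3, 4), r = 2)
```

Every row of T3 should lie within distance 2 of the center (1, 2, 3, 4).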
闹够了就滚
#4 · 2020-02-08 16:37

Here is one way to do it. Say we want to generate a set of 3D points of the form y = (y_1, y_2, y_3).

  1. Sample X from a multivariate Gaussian with mean zero and covariance matrix R.

       (x_1, x_2, x_3) ~ Multivariate_Gaussian(u = [0,0,0], R = [[r_11, r_12, r_13], [r_21, r_22, r_23], [r_31, r_32, r_33]])

    You can find a function that generates multivariate Gaussian samples in an R package, for example mvrnorm in the MASS package.

  2. Take the Gaussian cdf of each covariate: (phi(x_1), phi(x_2), phi(x_3)). Here phi is the Gaussian cdf of each variable, i.e. phi(x_1) = Pr[x <= x_1]. By the probability integral transform, (phi(x_1), phi(x_2), phi(x_3)) = (u_1, u_2, u_3) will each be uniformly distributed on [0,1].

  3. Then, take the inverse cdf of each uniformly distributed marginal. In other words take the inverse cdf of u_1, u_2, u_3:

    (F_1^{-1}(u_1), F_2^{-1}(u_2), F_3^{-1}(u_3)) = (y_1, y_2, y_3), where F_i is the marginal cdf of the i-th coordinate of the distribution you are trying to sample from.
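
The three steps above can be sketched in base R as follows. This is only an illustration under stated assumptions: the correlation matrix R and the target marginals (exponential, uniform on [-1, 1], gamma) are made up for the example, and the Gaussian samples are drawn with a Cholesky factor rather than a package function.

```r
set.seed(1)
n <- 1000

# Assumed correlation matrix (unit variances, so pnorm() is the cdf of each x_i)
R <- matrix(c(1.0, 0.5, 0.2,
              0.5, 1.0, 0.3,
              0.2, 0.3, 1.0), nrow = 3, byrow = TRUE)

# Step 1: correlated Gaussians. chol(R) is upper triangular with t(U) %*% U = R,
# so rows of z %*% U have covariance R.
z <- matrix(rnorm(n * 3), ncol = 3)
x <- z %*% chol(R)

# Step 2: Gaussian cdf of each column gives uniform margins on [0, 1]
u <- pnorm(x)

# Step 3: inverse cdf of each (assumed) target marginal
y <- data.frame(
  y1 = qexp(u[, 1], rate = 2),
  y2 = qunif(u[, 2], min = -1, max = 1),
  y3 = qgamma(u[, 3], shape = 3)
)
```

The columns of y keep the dependence induced by R while following the chosen marginals.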

闹够了就滚
#5 · 2020-02-08 16:39

Cuboid:

df <- data.frame(
    x = runif(1000),
    y = runif(1000),
    z = runif(1000)
)

head(df)

          x           y         z
1 0.7522104 0.579833314 0.7878651
2 0.2846864 0.520284731 0.8435828
3 0.2240340 0.001686003 0.2143208
4 0.4933712 0.250840233 0.4618258
5 0.6749785 0.298335804 0.4494820
6 0.7089414 0.141114804 0.3772317
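
The code above fills the unit cube. For a cuboid with other side lengths, a minimal sketch is to pass per-axis bounds to runif; the bounds lower and upper below are assumed values chosen only for illustration.

```r
set.seed(1)
n <- 1000

# Assumed per-axis bounds of the cuboid (illustrative values)
lower <- c(x = 0, y = -1, z = 10)
upper <- c(x = 2, y =  1, z = 15)

# One uniform column per axis, each drawn between its own bounds
cuboid <- as.data.frame(mapply(function(lo, hi) runif(n, lo, hi), lower, upper))
```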

Sphere, in spherical coordinates (uniform in each coordinate, which is not the same as uniform in volume):

df <- data.frame(
    radius = runif(1000),
    inclination = 2*pi*runif(1000),
    azimuth = 2*pi*runif(1000)
)


head(df)

     radius inclination  azimuth
1 0.1233281    5.363530 1.747377
2 0.1872865    5.309806 4.933985
3 0.2371039    5.029894 6.160549
4 0.2438854    2.962975 2.862862
5 0.5300013    3.340892 1.647043
6 0.6972793    4.777056 2.381325
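
If the points should be uniform throughout the volume of the unit ball and expressed in Cartesian coordinates, one standard adjustment is to draw the radius as U^(1/3), the inclination as acos of a uniform value on [-1, 1], and the azimuth uniformly on [0, 2*pi). The sketch below is an assumption-labelled variant of the code above, not the original answer's code; the names sph and xyz are illustrative.

```r
set.seed(1)
n <- 1000

# Spherical coordinates that are uniform in volume for the unit ball
sph <- data.frame(
  radius      = runif(n)^(1/3),        # cube root compensates for r^2 growth of shell area
  inclination = acos(runif(n, -1, 1)), # in [0, pi], with cos(inclination) uniform
  azimuth     = 2 * pi * runif(n)      # in [0, 2*pi)
)

# Convert to Cartesian coordinates
xyz <- with(sph, data.frame(
  x = radius * sin(inclination) * cos(azimuth),
  y = radius * sin(inclination) * sin(azimuth),
  z = radius * cos(inclination)
))
```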

Note: edited to include code for sphere

甜甜的少女心
#6 · 2020-02-08 16:42

Also check out the copula package. It generates data within a cube/hypercube with uniform margins, but with a correlation structure that you set. The generated variables can then be transformed to represent other shapes while keeping a dependence structure other than independence.

If you want more complex shapes but are happy with points that are uniform and independent within the shape, you can just do rejection sampling: generate data within a cube that contains your shape, test whether each point is within your shape, reject those that are not, and repeat until there are enough points.
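
A minimal sketch of that rejection-sampling loop in base R, assuming a bounding box and a user-supplied membership test; the helper name sample_in_shape and its arguments are hypothetical, chosen only for this example.

```r
# Hypothetical helper: rejection sampling of n points inside a shape.
# `lower`/`upper` bound a box that contains the shape; `inside` takes a matrix
# of candidate points (one per row) and returns a logical vector.
sample_in_shape <- function(n, lower, upper, inside) {
  d <- length(lower)
  kept <- matrix(numeric(0), ncol = d)
  while (nrow(kept) < n) {
    # Candidates uniform in the box; byrow = TRUE lines the recycled bounds up with columns
    cand <- matrix(runif(n * d, min = lower, max = upper), ncol = d, byrow = TRUE)
    # Keep only the candidates that fall inside the shape
    kept <- rbind(kept, cand[inside(cand), , drop = FALSE])
  }
  kept[seq_len(n), , drop = FALSE]
}

set.seed(42)
# Example: 1000 points in the unit ball in 3 dimensions
pts <- sample_in_shape(1000, rep(-1, 3), rep(1, 3),
                       function(p) rowSums(p^2) <= 1)
```

The loop only terminates if the shape occupies a non-negligible fraction of the box, so a tight bounding box keeps the rejection rate low.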
