generate random integers between two values with a

Posted 2019-01-23 20:22

I have the following four number sets:

A=[1,207];
B=[208,386];
C=[387,486];
D=[487,586].

I need to generate 20000 random numbers between 1 and 586 such that a generated number falls in A with probability 1/2 and in each of B, C, and D with probability 1/6.

How can I do this in R?

Tags: r random
2 answers
走好不送
Reply #2 · 2019-01-23 21:00

You can use sample directly, more specifically its prob argument. Just spread each set's probability evenly over its members: each number in A gets weight 0.5/207, each number in B gets (1/6)/179, and so on, so the weights over all 586 numbers sum to 1.

A <- 1:207
B <- 208:386
C <- 387:486
D <- 487:586
L <- sapply(list(A, B, C, D), length)

x <- sample(c(A, B, C, D),
            size = 20000,
            prob = rep(c(1/2, 1/6, 1/6, 1/6) / L, L),
            replace = TRUE)
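
As a quick sanity check on the weights above, here is a minimal sketch that confirms they sum to 1 and tabulates how the draws split across the four sets. The seed value, the names w, grp and props, and the use of cut() for labelling are my own choices for illustration, not part of the answer.

```r
set.seed(42)  # arbitrary seed, only so the run is reproducible

A <- 1:207
B <- 208:386
C <- 387:486
D <- 487:586
L <- sapply(list(A, B, C, D), length)

# Per-number weights: each set's probability split evenly over its members
w <- rep(c(1/2, 1/6, 1/6, 1/6) / L, L)
stopifnot(isTRUE(all.equal(sum(w), 1)))

x <- sample(c(A, B, C, D), size = 20000, prob = w, replace = TRUE)

# Label each draw with its set (intervals are right-closed) and tabulate shares
grp <- cut(x, breaks = c(0, 207, 386, 486, 586), labels = c("A", "B", "C", "D"))
props <- prop.table(table(grp))
print(round(props, 3))
```

The printed shares should come out close to 1/2 for A and close to 1/6 for each of B, C and D, with small random deviations from run to run.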
爱情/是我丢掉的垃圾
Reply #3 · 2019-01-23 21:23

I would suggest the roulette selection method. Here is a brief explanation. Take a line of length 1 unit and break it into pieces in proportion to the probability values. In our case, the first piece has length 1/2 and the next three pieces have length 1/6 each. Now sample a number between 0 and 1 from the uniform distribution. Since every number is equally likely, the probability that the sampled number falls in a given piece equals the length of that piece. So whichever piece the number falls in, sample from the corresponding vector. (The R code below can be run for a large number of draws to check that this holds.)

It is called roulette selection because of another analogy for the same situation: take a circle and split it into sectors whose angles are proportional to the probability values. Sample a number from the uniform distribution again, see which sector it falls in, and then sample uniformly from the corresponding vector.

A <- 1:207
B <- 208:386
C <- 387:486
D <- 487:586

cumList <- list(A,B,C,D)

probVec <- c(1/2,1/6,1/6,1/6)

cumProbVec <- cumsum(probVec)

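ret <- integer(20000)  # preallocate instead of growing the vector in the loop

for (i in seq_along(ret)) {

  rand <- runif(1)  # uniform draw between 0 and 1

  whichVec <- which(rand < cumProbVec)[1]  # first piece whose upper bound exceeds rand

  ret[i] <- sample(cumList[[whichVec]], 1)  # uniform draw within the chosen set

}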

# Testing the results

sum(ret %in% A) # Approximately 1/2*20000 of the values

sum(ret %in% B) # Approximately 1/6*20000 of the values

sum(ret %in% C) # Approximately 1/6*20000 of the values

sum(ret %in% D) # Approximately 1/6*20000 of the values
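
The same roulette idea can also be written without an explicit for loop. This is only a sketch of one possible variant, not the code from the answer: the names groups, n, u and shares, the seed, and the choice of findInterval() to locate the piece are my own assumptions.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible

groups  <- list(A = 1:207, B = 208:386, C = 387:486, D = 487:586)
probVec <- c(1/2, 1/6, 1/6, 1/6)
n <- 20000

# Interior cut points of the unit line: 1/2, 2/3, 5/6.
# The final cut point (1) is dropped so the index can never exceed 4.
cuts <- head(cumsum(probVec), -1)

# Step 1: one uniform draw per sample, mapped to the piece it falls in (1..4)
u <- runif(n)
whichVec <- findInterval(u, cuts) + 1

# Step 2: a uniform draw inside the chosen set for every sample
ret <- vapply(whichVec, function(g) sample(groups[[g]], 1), integer(1))

# Observed share of draws per set
shares <- table(names(groups)[whichVec]) / n
print(round(shares, 3))
```

Here findInterval() counts how many cut points are less than or equal to each uniform draw, which plays the role of which(rand < cumProbVec)[1] in the loop version. The printed shares should again be near 1/2, 1/6, 1/6 and 1/6.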