Generate non-negative (or positive) random integer

2019-06-08 06:09发布

问题:

I would like to randomly assign positive integers to G groups, such that they sum up to V.

For example, if G = 3 and V = 21, valid results may be (7, 7, 7), (10, 6, 5), etc.

Is there a straightforward way to do this?


Editor's notice (from 李哲源):

If values are not restricted to integers, the problem is simple and has been addressed in Choosing n numbers with fixed sum.

For integers, there is a previous Q & A: Generate N random integers that sum to M in R but it appears more complicated and is hard to follow. The loop based solution over there is also not satisfying.

回答1:

non-negative integers

Let n be sample size:

x <- rmultinom(n, V, rep.int(1 / G, G))

is a G x n matrix, where each column is a multinomial sample that sums up to V.

By passing rep.int(1 / G, G) to argument prob I assume that each group has equal probability of "success".


positive integers

As Gregor mentions, a multinomial sample can contain 0. If such samples are undesired, they should be rejected. As a result, we sample from a truncated multinomial distribution.

In How to generate target number of samples from a distribution under a rejection criterion I suggested an "over-sampling" approach to achieve "vectorization" for a truncated sampling. Simply put, Knowing the acceptance probability we can estimate the expected number of trials M to see the first "success" (non-zero). We first sample say 1.25 * M samples, then there will be at least one "success" in these samples. We randomly return one as the output.

The following function implements this idea to generate truncated multinomial samples without 0.

positive_rmultinom <- function (n, V, prob) {
  ## input validation
  G <- length(prob)
  if (G > V) stop("'G > V' causes 0 in a sample for sure!")
  if (any(prob < 0)) stop("'prob' can not contain negative values!")
  ## normalization
  sum_prob <- sum(prob)
  if (sum_prob != 1) prob <- prob / sum_prob
  ## minimal probability
  min_prob <- min(prob)
  ## expected number of trials to get a "success" on the group with min_prob
  M <- round(1.25 * 1 / min_prob)
  ## sampling
  N <- n * M
  x <- rmultinom(N, V, prob)
  keep <- which(colSums(x == 0) == 0)
  x[, sample(keep, n)]
  }

Now let's try

V <- 76
prob <- c(53, 13, 9, 1)

Directly using rmultinom to draw samples can occasionally result in ones with 0:

## number of samples that contain 0 in 1000 trials
sum(colSums(rmultinom(1000, V, prob) == 0) > 0)
#[1] 355   ## or some other value greater than 0

But there is no such issue by using positive_rmultinom:

## number of samples that contain 0 in 1000 trials
sum(colSums(positive_rmultinom(1000, V, prob) == 0) > 0)
#[1] 0


回答2:

Probably a less expensive way, but this seems to work.

G <- 3
V <- 21
m <- data.frame(matrix(rep(1:V,G),V,G))
tmp <- expand.grid(m) # all possibilities
out <- tmp[which(rowSums(tmp) == V),] # pluck those that sum to 'V'
out[sample(1:nrow(out),1),] # randomly select a column

Not sure how to do with runif



标签: r random