可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

Goal

I am trying to build a function that draws a specific number of random numbers from an an "incomplete uniform distribution".

What do I call an incomplete uniform distribution?

I call an incomplete uniform distribution a probability distribution where each value of X within a series of boundaries have an equal probability to be picked. In other words it is a uniform distribution with holes (where the probability is zero) as represented below

x = list(12:25, 34:54, 67:90, 93:115)
y = 1/sum(25-12, 54-34, 90-67, 115-93)
plot(y=rep(y, length(unlist(x))), x=unlist(x), type="n", ylab="Probability", xlab="X")
for (xi in x)
{
    points(xi,rep(y, length(xi)), type="l", lwd=4)
}

Ugly Solution

Here is a slow and ugly solution

IncompleteUnif = function(n,b)
{
    #################
    # "n" is the desired number of random numbers
    # "b" is a list describing the boundaries within which a random number can possibly be drawn.
    #################
    r = c() # Series of random numbers to return
    for (ni in n)
    {
        while (length(r) < n) # loop will continue until we have the "n" random numbers we need
        {
            ub = unlist(b)
            x = runif(1,min(ub), max(ub)) # one random number taken over the whole range
            for (bi in b) # The following loop test if the random number is withinn the ranges specified by "b"
            {
                if (min(bi) < x & max(bi) > x) # if found in one range then just add "x" to "r" and break
                {
                    r = append(r,x)
                    break
                }
            }
        }
    }
    return (r)
}

b = list(c(5,94),c(100,198),c(220,292), c(300,350))
set.seed(12)
IncompleteUnif(10,b)
[1]  28.929516 287.132444 330.204498  63.425103  16.693990  66.680826 226.374551  12.892821   7.872065 140.480533

回答1:

Slightly convoluted version of the @Gregor's solution.

mix_unif <- function(n, b){
  x <- c()
  ns <- rmultinom(1, n, sapply(b, diff))
  for (i in seq_along(ns)) {
    x <- c(x, runif(ns[i], b[[i]][1], b[[i]][2]))
  }
  x
}

microbenchmark(mix_unif(1e5, b),
               rmixunif(1e5, b),
               IncompleteUnif(1e5, b), 
               unit="relative")
Unit: relative
                     expr      min       lq     mean   median       uq      max neval
       mix_unif(1e+05, b) 1.000000 1.000000 1.000000 1.000000 1.000000  1.00000   100
       rmixunif(1e+05, b) 3.123515 3.235961 3.750369 3.496843 3.462529 15.73449   100
 IncompleteUnif(1e+05, b) 6.806916 7.247425 6.926282 7.188556 7.093928 18.20041   100

回答2:

Your incomplete uniform distribution can be represented as a mixture of four ordinary uniform distributions, with mixing weight for each segment proportional to its length (i.e., so that the longer the segment, the more weight it has).

To sample from such a distribution, first choose a segment (taking the weights into account). Then from the chosen segment, choose an element.

回答3:

I believe this works, using the algorithm suggested by Robert Dodier:

rmixunif = function(n, b) {
    subdists = sample(seq_along(b), size = n, replace = T, prob = sapply(b, diff))
    subdists_n = tabulate(subdists)
    draw = numeric(n)
    for (i in unique(subdists)) {
        draw[subdists == i] = runif(subdists_n[i], min = b[[i]][1], max = b[[i]][2])
    }
    return(draw)
}

rmixunif(10, b = list(c(5,94),c(100,198),c(220,292), c(300,350)))
#  [1]  64.85989  85.33292 235.39607 233.40133 240.28686  67.21626 237.60248  11.80377 151.65365 306.44473

I like Sam Dickson's histogram visual check, here's my version:

x <- rmixunif(10000,list(c(0,1),c(2.5,3),c(6,10)))
hist(x,breaks=20)

It could be fancied up with some input-checking (and maybe an mapply like suggested in the comments), but I'll leave that to others.

Thanks to alexis_iaz for the tabulate() suggestion!

回答4:

Another solution is to transform the output. The idea is to sample from a random uniform distribution, then apply a conditional transformation so that the numbers fall only in the selected ranges:

IncompleteUnif = function(n,b) {
  widths <- cumsum(sapply(b,diff))
  x <- runif(n,0,tail(widths,1))
  out <- x
  out[x<=widths[1]] <- x[x<=widths[1]] + b[[1]][1]
  for(i in 2:length(b)) {
    out[widths[i-1]<x & x<=widths[i]] <- x[widths[i-1]<x & x<=widths[i]] - widths[i-1] + b[[i]][1]
  }
  return(out)
}

x <- IncompleteUnif(10000,list(c(0,1),c(2.5,3),c(6,10)))
hist(x,breaks=20)

回答5:

I'm a few years late to the party, but seeing that there wasn't a solution without an explicit loop yet, here's one such implementation (following @RobertDodier's approach):

rmunif <- function(n, b) {
  runifb <- function(n, b) runif(n, b[1], b[2])
  ns <- rmultinom(1, n, vapply(b, diff, 1))
  unlist(Map(runifb, ns, b), use.names = FALSE)
}

hist(rmunif(1e5, list(0:1, c(5, 8), 9:10)))

library(microbenchmark)

set.seed(2018)
n <- 1e5

microbenchmark(
  rmunif(n, b),
  mix_unif(n, b),
  rmixunif(n, b),
  IncompleteUnif(n, b),
  unit = "relative"
) -> mb

print(mb, signif = 5)
#> Unit: relative
#>                  expr    min     lq   mean median     uq    max neval
#>          rmunif(n, b) 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000   100
#>        mix_unif(n, b) 1.1181 1.1256 1.1281 1.1728 1.1236 1.0476   100
#>        rmixunif(n, b) 2.7822 2.8982 2.9899 2.7850 2.8345 1.3970   100
#>  IncompleteUnif(n, b) 4.4922 4.7089 5.2732 4.5764 8.4317 2.4364   100

Created on 2018-03-11 by the reprex package (v0.2.0).