Strange precision issues in R when computing cumul

2019-02-24 23:58发布

问题:

I've been running into some weird problems when using this code:

positions<-c(58256)
occurrencies<-c(30)
frequency<-c(11/5531777)
length<-c(4)

prob<-c(0)
for(i in 0:(occurrencies-1))
{
  pow<-frequency^i
  pow1<-(1-frequency)^(positions-i)
  bin<-choose(positions, i)
  prob<<-prob+(bin*pow*pow1)
}

Each iteration of this for loop should calculate the binomial probability that, i number of occurrences of the event occur given the frequency. Each iteration also sums up the result. This should result in the prob variable never exceeding 1, but after 7 or so for loop iterations, everything goes to hell and prob excedes 1.

I thought it might be a question of precision digits, so i tried using Rmpfr but to no avail- the same problem persisted.

I was wondering if there are any tips or packages to overcome this situation, or if I'm stuck with this.

回答1:

Following Ben Bolker's advice to see ?pbinom

pbinom(q = occurencies, size = positions, prob = frequency, lower.tail = FALSE)


回答2:

You can avoid your for loop by doing

prob<-0
i    <- 0:(occurrencies-1)
pow  <- frequency^i
pow1 <- (1-frequency)^(positions-i)
bin  <- choose(positions, i)
prob <- cumsum(prob+(bin*pow*pow1))
[1] 0.8906152 0.9937867 0.9997624 0.9999932 0.9999998 1.0000000 1.0000000 1.0000000 1.0000000
[10] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
[19] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
[28] 1.0000000 1.0000000 1.0000000

I don't know if this is your desired result, but surely you can avoid the for loop going this fashion.

See @Ben Bolker's comment and take a look at pbinom function.