Average of large number of Dice Rolls in Haskell

2019-06-24 07:50发布

问题:

In an attempt to learn Haskell better, I'm trying to write a program that displays the average value of the sum of 2 die, rolled X number of times. This is fairly simple in C, Java, Python... but I'm stuck in Haskell. Here's a naive attempt:

import System.Random

main = do
    g <- getStdGen
    let trials = 10000000
    let rolls = take trials (randomRs (2, 12) g :: [Int])
    let average = div (sum rolls) trials
    print average

For low number of trials, the program works. But when I run this code with ten million trials, I get an error:

Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

There's got to be a better way to write this program. In the C, Java, and Python versions, this is a simple task. I've looked at this post (and understand about 75% of the material), but when I adapt that code to this situation, summing a sequence of R [Int] doesn't work (and I'm not sure how to 'unwrap' the [Int]). What am I doing wrong? What's the right way? How do I reach random number enlightenment in Haskell?

Edit: in addition to the answer selected, as rtperson points out below, the modeling of 2 dice is incorrect; it should really be the sum of two independent rolls from 1 to 6.

回答1:

sum is no good to sum a long list, it runs in linear space. Try this strict version of sum:

sum' = foldl' (+) 0

foldl' is defined in Data.List.

EDIT More information can be found in this HaskellWiki article.



回答2:

Actually, the probabilities are modeled incorrectly here. As the code is written, there's the same possibility of getting 2 through 12. But that's not how dice work. Of the 12 possible outcomes, there's only one way to get 2 (via 1 and 1) and 12 (through 6 and 6). But there are 6 ways to get 7 (1-6, 2-5, 3-4, 4-3, 5-2, 6-1). Modeling the roll of two dice, rather than a single 2 to 12 chance, would give the correct expected value of 7.

However -- and here's where your hair will really start to curl -- you can't simply do something like the following:

let rolls1 = take trials (randomRs (1, 6) g :: [Int])
let rolls2 = take trials (randomRs (1, 6) g :: [Int])

Because rolls1 and rolls2 will yield the same results.

*Main> let rolls = zip rolls1 rolls2
*Main> take 10 rolls
[(3,3),(4,4),(5,5),(3,3),(5,5),(1,1),(3,3),(1,1),(3,3),(3,3)]

So your result will always be even, and hence remain the incorrect answer 6.

To get two distinctly random lists, you'll have to produce a new StdGen:

import System.Random
import Data.List

main = do
    g <- getStdGen
    b <- newStdGen   -- calls "split" on the global generator
    let trials = 10000000
    let rolls1 = take trials (randomRs (1, 6) g :: [Int])
    let rolls2 = take trials (randomRs (1, 6) b :: [Int])
    let rolls = zipWith (+) rolls1 rolls2
    let average = div (foldl' (+) 0 rolls) trials
    print average

(Note that I'm specifically avoiding using the State monad for this. You're welcome.)

This takes longer than the original version, but the laziness of zipWith combined with the strictness of foldl' ensure that you won't overflow the stack.

If you're just trying to fix the overflow, n.m.'s answer is correct. The strictness of foldl' will fix the performance issue. But in this case, getting the correct answer will also teach you something about random number generation in Haskell.