(Edited) How to get random number in Haskell witho

Posted 2019-07-07 07:34

Question:

I want a function that returns a different StdGen on each call, without using IO. I've tried unsafePerformIO, as in the following code.

import System.IO.Unsafe
import System.Random

myStdGen :: StdGen
myStdGen = unsafePerformIO getStdGen

But when I evaluate myStdGen in ghci, I always get the same value. Have I abused unsafePerformIO? Or is there another way to reach my goal?

EDIT Sorry, I think I should describe my question more precisely.

Actually, I'm implementing a variation of the treap data structure, which needs a special 'merge' operation. It relies on some randomness to guarantee amortized O(log n) expected time complexity.

I've tried to use a pair like (Tree, StdGen) to keep a random generator with each treap. When inserting a new element into the treap, I use random to give the new node a random value, and then update the generator. But I've run into a problem. I have a function called empty which returns an empty treap, and I used myStdGen above to get the random generator for that treap. However, if I have two empty treaps, their StdGen is the same. So after I insert an element into both treaps and then want to merge them, their random values are the same, too. Therefore, I lose the randomness I rely on.

That's why I would like to have a kind of "global" random generator, which yields a different StdGen on each call, so that each empty treap gets a different StdGen.

Answer 1:

This is not a good use of unsafePerformIO.

The reason you see the same number repeatedly in GHCi is that GHCi itself does not know that the value is impure, and so remembers the value from the last time you evaluated it. You can type IO commands into the top level of GHCi, so you would expect to see a different value if you just type getStdGen. However, this won't work either, because by default GHCi does not revert top-level expressions between evaluations. You can turn this behaviour off with :set +r:

> :set +r
> getStdGen
2144783736 1
> getStdGen
1026741422 1

Note that your impure function pretending to be pure will still not work.

> myStdGen
480142475 1
> myStdGen
480142475 1
> myStdGen
480142475 1

You really do not want to go down this route. unsafePerformIO is not meant to be used this way, and it is not a good idea in any case. There are ways to get what you wanted (like unsafePerformIO randomIO :: Int) but they will not lead you to good things. Instead you should be doing calculations based on random numbers inside a random monad, and running that in the IO monad.

Update

I see from your update why you wanted this in the first place.

There are many interesting thoughts on the problem of randomness within otherwise referentially transparent functions in this answer.

Despite the fact that some people advocate the use of unsafePerformIO in this case, it is still a bad idea for a number of reasons which are outlined in various parts of that page. In the end, if a function depends on a source of randomness, it is best for that to be specified in its type, and the easiest way to do that is to put it in a random monad. This is still a pure function, just one that takes a generator when it is called. You can provide this generator by asking for a random one in the main IO routine.
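As a minimal sketch of "randomness specified in the type", here is a plain function that takes a generator and hands back the updated one. The names pickOne and pickWithSeed are made up for illustration; only StdGen, mkStdGen and randomR come from System.Random.

```haskell
import System.Random (StdGen, mkStdGen, randomR)

-- Hypothetical helper: pick one element of a list.
-- The StdGen in the type makes the dependence on randomness explicit,
-- and the updated generator is returned for the next caller.
pickOne :: [a] -> StdGen -> Maybe (a, StdGen)
pickOne [] _ = Nothing
pickOne xs g =
  let (i, g') = randomR (0, length xs - 1) g  -- inclusive index range
  in Just (xs !! i, g')

-- Hypothetical convenience wrapper with a fixed seed, handy for
-- reproducible tests; same seed and list always give the same answer.
pickWithSeed :: Int -> [a] -> Maybe a
pickWithSeed seed xs = fst <$> pickOne xs (mkStdGen seed)
```

In a real program the generator would typically be obtained once in main, for example with newStdGen, and then passed to pickOne; everything after that point stays pure.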

A good example of how to use the random monad can be found here.



Answer 2:

Have I abused unsafePerformIO?

Heck yes! The "distinguishing features of a pure function" are

  • No side-effects
  • Referentially transparent, i.e. each subsequent evaluation of the same expression must yield the same value.

There is in fact a way to achieve your "goal", but the idea is just wrong.

myStdGen :: () -> StdGen
myStdGen () = unsafePerformIO getStdGen

Because this is a (useless) function call instead of a CAF, it'll evaluate the IO action at each call separately.

Still, I think the compiler is pretty much free to optimise that away, so definitely don't rely on it.

EDIT: Upon trying this, I noticed that getStdGen itself always gives the same generator when used within a given process (it only reads the global generator; newStdGen is the action that splits and updates it), so even if the above used more reasonable types it would not work.


Note that correct use of pseudorandomness in algorithms etc. does not need IO everywhere; for instance, you can manually seed your StdGen, but then properly propagate it with split etc. A much nicer way to handle that is with a randomness monad. The program as a whole will then always yield the same result, but internally have all the different random numbers it needs to work usefully.
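A rough sketch of how split could address the two-empty-treaps problem from the question. The Treap type here is a hypothetical stand-in that only carries its generator (the tree itself is omitted), and emptyTreap, nextPriority and twoEmpties are illustrative names, not anything from the asker's code.

```haskell
import System.Random (StdGen, mkStdGen, random, split)

-- Hypothetical stand-in for the asker's (Tree, StdGen) pair;
-- the actual tree structure is left out of this sketch.
newtype Treap = Treap { treapGen :: StdGen }

-- Create an empty treap from a generator. The generator is split:
-- one half stays with the treap, the other is returned to the caller
-- so the next treap gets a different one.
emptyTreap :: StdGen -> (Treap, StdGen)
emptyTreap g =
  let (mine, rest) = split g
  in (Treap mine, rest)

-- Draw the priority for a newly inserted node and update the
-- treap's own generator.
nextPriority :: Treap -> (Int, Treap)
nextPriority (Treap g) =
  let (p, g') = random g
  in (p, Treap g')

-- Two empty treaps built from one seed end up with distinct generators.
twoEmpties :: StdGen -> (Treap, Treap, StdGen)
twoEmpties g0 =
  let (t1, g1) = emptyTreap g0
      (t2, g2) = emptyTreap g1
  in (t1, t2, g2)
```

Calling twoEmpties (mkStdGen 42) gives the same pair of treaps on every run, but the two treaps no longer share a generator, which is the property the merge operation needs.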

Alternatively, you can obtain a generator from IO, but still write your algorithm in a pure random monad rather than IO.


There's another way to obtain "randomness" in a completely pure algorithm: require the input to be Hashable! Then, you can effectively use any function argument as a random seed. This is a bit of a strange solution, but might work for your treap application (though I reckon some people would not classify it as a treap anymore, but as a special kind of hashmap).
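One possible reading of the Hashable idea, as a small sketch: derive a node's priority from the hash of its key, so no generator has to be stored or threaded at all. This assumes the hashable package for Data.Hashable; priorityFor is a hypothetical name.

```haskell
import Data.Hashable (Hashable, hash)
import System.Random (mkStdGen, random)

-- Hypothetical helper: use the key's hash as the seed and take the
-- first number of the resulting generator as the node's priority.
-- The same key always maps to the same priority, so this is pure,
-- but the priorities are only as well spread as the hash function.
priorityFor :: Hashable k => k -> Int
priorityFor key = fst (random (mkStdGen (hash key)))
```

The trade-off is that an adversary who knows the hash function can pick keys with bad priorities, which is presumably part of why the answer calls this a bit of a strange solution.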



Answer 3:

Yes, you have abused unsafePerformIO. There are very few valid reasons to use unsafePerformIO, such as when interfacing with a C library, and it's also used in the implementation of a handful of core libraries (I think ST being one of them). In short, don't use unsafePerformIO unless you're really really sure of what you're doing. It is not meant for generating random numbers.

Recall that functions in Haskell have to be pure, meaning that they only depend on their inputs. This means that you can have a pure function that generates a "random" number, but that number depends on the random generator you pass to it. You could do something like

myStdGen :: StdGen
myStdGen = mkStdGen 42

Then you could do

randomInt :: StdGen -> (Int, StdGen)
randomInt g = random g

But then you must use the new StdGen returned from this function moving forward, or you will always get the same output from randomInt.
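To make the point about reusing a generator concrete, here is a small sketch contrasting the two cases. sameTwice and twoDifferent are illustrative names; randomInt is the function from above, written point-free.

```haskell
import System.Random (StdGen, mkStdGen, random)

randomInt :: StdGen -> (Int, StdGen)
randomInt = random

-- Mistake: both calls see the same generator, so both components
-- of the pair are always equal.
sameTwice :: StdGen -> (Int, Int)
sameTwice g = (fst (randomInt g), fst (randomInt g))

-- Correct: the generator returned by the first call feeds the second,
-- and the final generator is returned for later use.
twoDifferent :: StdGen -> ((Int, Int), StdGen)
twoDifferent g0 =
  let (a, g1) = randomInt g0
      (b, g2) = randomInt g1
  in ((a, b), g2)
```

Threading g0, g1, g2 by hand like this works, but it is easy to pass the wrong one by accident, which is the bookkeeping the State monad takes over.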


So you may be wondering, how do you cleanly generate random numbers without resorting to IO? The answer is the State monad. It looks similar to

newtype State s a = State { runState :: s -> (a, s) }

And its monad instance looks like

instance Monad (State s) where
    return a = State $ \s -> (a, s)
    (State m) >>= f = State $ \s -> let (a, newState) = m s
                                        (State g)     = f a
                                    in g newState

It's a little confusing to look at the first time, but essentially all the state monad does is fancy function composition. See LYAH for a more detailed explanation. What's important to note here is that the type of s does not change between steps; only the a parameter can change.

You'll notice that s -> (a, s) looks a lot like our function StdGen -> (Int, StdGen), with s ~ StdGen and a ~ Int. That means that if we did

randomIntS :: State StdGen Int
randomIntS = State randomInt

Then we could do

twoRandInts :: State StdGen (Int, Int)
twoRandInts = do
    a <- randomIntS
    b <- randomIntS
    return (a, b)

Then it can be run by supplying an initial state:

main = do
    g <- getStdGen
    print $ runState twoRandInts g

The StdGen still comes out of IO, but then all the logic itself occurs within the state monad purely.
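For reference, a self-contained sketch that collects the pieces above into one file. It keeps the hand-rolled State newtype from this answer rather than a library one, and adds the Functor and Applicative instances that GHC 7.10 and later require before a Monad instance is accepted. evalState is an extra helper added here for convenience.

```haskell
import System.Random (StdGen, mkStdGen, random)

-- Hand-rolled state monad, as in the answer above.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State m) = State $ \s ->
    let (a, s') = m s
    in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State mf <*> State ma = State $ \s ->
    let (f, s')  = mf s
        (a, s'') = ma s'
    in (f a, s'')

instance Monad (State s) where
  State m >>= f = State $ \s ->
    let (a, s') = m s      -- run the first step
    in runState (f a) s'   -- feed its result and new state to the next

-- Wrap the generator-passing function as a stateful action.
randomIntS :: State StdGen Int
randomIntS = State random

twoRandInts :: State StdGen (Int, Int)
twoRandInts = do
  a <- randomIntS
  b <- randomIntS
  return (a, b)

-- Helper added for this sketch: run an action and drop the final state.
evalState :: State s a -> s -> a
evalState m = fst . runState m
```

With a fixed seed, evalState twoRandInts (mkStdGen 42) returns the same pair on every run; swapping in a generator obtained from IO in main changes nothing else in the code. With the mtl package's Control.Monad.State, the equivalent of State random would be state random, since that library does not export a State data constructor.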