可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I want to have a function that return different stdGen
in each call without IO.
I've tried to use unsafePerformIO
, as the following code.
import System.IO.Unsafe
import System.Random
myStdGen :: StdGen
myStdGen = unsafePerformIO getStdGen
But when I try to call myStdGen
in ghci, I always get the same value. Have I abused unsafePerformIO
? Or is there any other ways to reach my goal?
EDIT
Sorry, I think I should describe my question more precisely.
Actually, I'm implementing a variation of the treap data strutcure, which needs a special 'merge' operation. It relies on some randomness to guarentee amortized O(log n) expected time complexity.
I've tried to use a pair like (Tree, StdGen)
to keep the random generator for each treap. When inserting a new data to the treap, I would use random
to give random value to the new node, and then update my generator. But I've encountered a problem. I have a function called empty
which will return an empty treap, and I used the function myStdGen
above to get the random generator for this treap. However, if I have two empty treap, their StdGen
would be the same. So after I inserted a data to both treap and when I want to merge them, their random value would be the same, too. Therefore, I lost the randomness which I relies on.
That's why I would like to have a somehow "global" random generator, which yields different StdGen
for each call, so that each empty treap could have different StdGen
.
回答1:
This is not a good use of unsafePerformIO
.
The reason you see the same number repeatedly in GHCi is that GHCi itself does not know that the value is impure, and so remembers the value from the last time you called it. You can type IO commands into the top level of GHCi, so you would expect to see a different value if you just type getStdGen
. However, this won't work either, due to an obscure part of the way GHCi works involving not reverting top-level expressions. You can turn this of with :set +r
:
> :set +r
> getStdGen
2144783736 1
> getStdGen
1026741422 1
Note that your impure function pretending to be pure will still not work.
> myStdGen
480142475 1
> myStdGen
480142475 1
> myStdGen
480142475 1
You really do not want to go down this route. unsafePerformIO
is not supposed to be used this way, and nor is it a good idea at all. There are ways to get what you wanted (like unsafePerformIO randomIO :: Int
) but they will not lead you to good things. Instead you should be doing calculations based on random numbers inside a random monad, and running that in the IO monad.
Update
I see from your updatee why you wanted this in the first place.
There are many interesting thoughts on the problem of randomness within otherwise referentially transparent functions in this answer.
Despite the fact that some people advocate the use of unsafePerformIO
in this case, it is still a bad idea for a number of reasons which are outlined in various parts of that page. In the end, if a function depends on a source of randomness it is best for that to be specified in it's type, and the easiest way to do that is put it in a random monad. This is still a pure function, just one that takes a generator when it is called. You can provide this generator by asking for a random one in the main IO
routine.
A good example of how to use the random monad can be found here.
回答2:
Do I abused unsafePerformIO
Heck yes! The "distinguishing features of a pure function" are
- No side-effects
- Referentially transparent, i.e. each subsequent eval of the result must yield the same.
There is in fact a way to achieve your "goal", but the idea is just wrong.
myStdGen :: () -> StdGen
myStdGen () = unsafePerformIO getStdGen
Because this is a (useless) function call instead of a CAF, it'll evaluate the IO
action at each call seperately.
Still, I think the compiler is pretty much free to optimise that away, so definitely don't rely on it.
EDIT upon trying I noticed that getStdGen
itself always gives the same generator when used within a given process, so even if the above would use more reasonable types it would not work.
Note that correct use of pseudorandomness in algorithms etc. does
not need IO everywhere – for instance you can manually seed your
StdGen
, but then properly propagate it with
split
etc.. A much nicer way to handle that is with a randomness monad. The program as a whole will then always yield the same result, but internally have all different random numbers as needed to work usefully.
Alternatively, you can obtain a generator from IO, but still write your algorithm in a pure random monad rather than IO
.
There's another way to obtain "randomness" in a completely pure algorithm: require the input to be Hashable
! Then, you can effectively use any function argument as a random seed. This is a bit of a strange solution, but might work for your treap application (though I reckon some people would not classify it as a treap anymore, but as a special kind of hashmap).
回答3:
Yes, you have abused unsafePerformIO
. There are very few valid reasons to use unsafePerformIO
, such as when interfacing with a C library, and it's also used in the implementation of a handful of core libraries (I think ST
being one of them). In short, don't use unsafePerformIO
unless you're really really sure of what you're doing. It is not meant for generating random numbers.
Recall that functions in Haskell have to be pure, meaning that they only depend on their inputs. This means that you can have a pure function that generates a "random" number, but that number is dependent on the random generator you pass to it, you could do something like
myStdGen :: StdGen
myStdGen = mkStdGen 42
Then you could do
randomInt :: StdGen -> (Int, StdGen)
randomInt g = random
But then you must use the new StdGen
returned from this function moving forward, or you will always get the same output from randomInt
.
So you may be wondering, how do you cleanly generate random numbers without resorting to IO
? The answer is the State
monad. It looks similar to
newtype State s a = State { runState :: s -> (a, s) }
And its monad instance looks like
instance Monad (State s) where
return a = State $ \s -> (a, s)
(State m) >>= f = State $ \s -> let (a, newState) = m s
(State g) = f a
in g newState
It's a little confusing to look at the first time, but essentially all the state monad does is fancy function composition. See LYAH for a more detailed explanation. What's important to note here is that the type of s
does not change between steps, just the a
parameter can change.
You'll notice that s -> (a, s)
looks a lot like our function StdGen -> (Int, StdGen)
, with s ~ StdGen
and a ~ Int
. That means that if we did
randomIntS :: State StdGen Int
randomIntS = State randomInt
Then we could do
twoRandInts :: State StdGen (Int, Int)
twoRandInts = do
a <- randomIntS
b <- randomIntS
return (a, b)
Then it can be run by supplying an initial state:
main = do
g <- getStdGen
print $ runState twoRandInts g
The StdGen
still comes out of IO
, but then all the logic itself occurs within the state monad purely.