To get acquainted with unsafePerformIO
(how to use it and when to use it), I've implemented a module for generating unique values.
Here's what I have:
module Unique (newUnique) where
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)
-- Type to represent a unique thing.
-- Show is derived just for testing purposes.
newtype Unique = U Integer
deriving Show
-- I believe this is the Haskell'98 derived instance, but
-- I want to be explicit, since its Eq instance is the most
-- important part of Unique.
instance Eq Unique where
(U x) == (U y) = x == y
counter :: IORef Integer
counter = unsafePerformIO $ newIORef 0
updateCounter :: IO ()
updateCounter = do
x <- readIORef counter
writeIORef counter (x+1)
readCounter :: IO Integer
readCounter = readIORef counter
newUnique' :: IO Unique
newUnique' = do { x <- readIORef counter
; writeIORef counter (x+1)
; return $ U x }
newUnique :: () -> Unique
newUnique () = unsafePerformIO newUnique'
To my delight, the package called Data.Unique
chose the same datatype as I did; on the other hand, they chose the type newUnique :: IO Unique
, but I want to stay out of IO
if possible.
Is this implementation dangerous? Could it possibly lead GHC to change the semantics of a program which uses it?
The purpose of
unsafePerformIO
is when your function does some action internally, but has no side effects that an observer would notice.For example, a function that take a vector, copies it, quicksorts the copy in-place, then returns the copy.(see comments) Each of these operations has side effects, and so is inIO
, but the overall result does not.newUnique
must be anIO
action because it generates something different every time. This is basically the definition ofIO
, it means a verb, as opposed to functions which are adjectives. A function will always return the same result for the same arguments. This is called referential transparency.For valid uses of
unsafePerformIO
, see this question.See an another example how this fails:
With this code the effect is the same as in ntc2's example: correct with -O0, but incorrect with -O. But in this code there is no "common subexpression to eliminate".
What's actually happening here is that the
newUnique ()
expression is "floated out" to the top-level, because it doesn't depend on the function's parameters. In GHC speak this is-ffull-laziness
(on by default with-O
, can be turned off with-O -fno-full-laziness
).So the code effectively becomes this:
And here helperworker is a thunk that can only be evaluated once.
With the already recommended NOINLINE pragmas if you add
-fno-full-laziness
to the command line, then it works as expected.Treat
unsafePerformIO
as a promise to the compiler. It says "I promise that you can treat this IO action as if it were a pure value and nothing will go wrong". It's useful because there are times you can build a pure interface to a computation implemented with impure operations, but it's impossible for the compiler to verify when this is the case; insteadunsafePerformIO
allows you to put your hand on your heart and swear that you have verified that the impure computation is actually pure, so the compiler can simply trust that it is.In this case that promise is false. If
newUnique
were a pure function thenlet x = newUnique () in (x, x)
and(newUnique (), newUnique ())
would be equivalent expressions. But you would want these two expressions to have different results; a pair of duplicates of the sameUnique
value in one case, and a pair of two differentUnique
values in the other. With your code, there's really no way to say what either expression means. They can only be understood by considering the actual sequence of operations the program will carry out at runtime, and control over that is exactly what you're relinquishing when you useunsafePerformIO
.unsafePerformIO
says it doesn't matter whether either expression is compiled as one execution ofnewUnique
or two, and any implementation of Haskell is free to choose whatever it likes each and every time it encounters such code.Yes, your module is dangerous. Consider this example:
Compile and run:
Compile with optimization and run:
Uh-oh!
Adding
{-# NOINLINE counter #-}
and{-# NOINLINE newUnique #-}
does not help, so I'm not actually sure what's happening here ...1st UPDATE
Looking at the GHC core, I see that @LambdaFairy was correct that constant subexpression elimination (CSE) caused my
newUnique ()
expressions to be lifted. However, preventing CSE with-fno-cse
and adding{-# NOINLINE counter #-}
toUnique.hs
is not sufficient to make the optimized program print the same as the unoptimized program!In particular, it seems that. Does anyone understand why?counter
is inlined even with theNOINLINE
pragma inUnique.hs
I've uploaded the full versions of the following core files at https://gist.github.com/ntc2/6986500.
The (relevant) core for
main
when compiling with-O
:Note that the
newUnique ()
calls have been lifted and bound tomain3
.And now when compiling with
-O -fno-cse
:Note that
main3
andmain5
are the two separatenewUnique ()
calls.However:
Looking at the core for this modified
Unique.hs
:it seems that(2nd update: wrong!counter
is being inlined ascounter_rag
, despite theNOINLINE
pragmacounter_rag
is not marked with[InlPrag=NOINLINE]
, but that doesn't mean it's been inlined; rather,counter_rag
is just the munged name ofcounter
); theNOINLINE
fornewUnique
is respected though:What's going on here?
2nd UPDATE
User @errge figured it out. Looking more carefully that the last core output pasted above, we see that most of the body of
Unique.newUnique
has been floated to the top level aslvl4_rvj
. However,lvl4_rvj
is a constant expression, not a function, and so it's only evaluated once, explaining the repeatedU 0
output bymain
.Indeed:
I don't understand exactly what the
-ffull-laziness
optimization does -- the GHC docs talk about floating let bindings, but the body oflvl4_rvj
does not appear to have been a let binding -- but we can at least compare the above core with the core generated with-fno-full-laziness
and see that now the body is not lifted:Here
counter_rfj
corresponds tocounter
again, and we see the difference is that the body ofUnique.newUnique
has not been lifted, and so the reference updating (readMutVar
,writeMutVar
) code will be run each timeUnique.newUnique
is called.I've updated the gist to include the new
-fno-full-laziness
core file. The earlier core files were generated on a different computer, so some minor differences here are unrelated to-fno-full-laziness
.