In what sense is the IO Monad pure?

Posted 2019-01-12 22:25

Question:

I've had the IO monad described to me as a State monad where the state is "the real world". The proponents of this approach to IO argue that this makes IO operations pure, as in referentially transparent. Why is that? From my perspective it appears that code inside the IO monad has plenty of observable side effects. Also, isn't it possible to describe pretty much any non-pure function as a function of the real world? For example, can't we think of, say, C's malloc as being a function that takes a RealWorld and an Int and returns a pointer and a RealWorld, only just like in the IO monad the RealWorld is implicit?

Note: I know what a monad is and how it's used. Please don't respond with a link to a random monad tutorial unless it specifically addresses my question.

Answer 1:

I think the best explanation I've heard was actually fairly recently on SO. IO Foo is a recipe for creating a Foo. Another common, more literal, way of saying this is that it is a "program that produces a Foo". It can be executed (many times) to create a Foo or die trying. The execution of the recipe/program is what we ultimately want (otherwise, why write one?), but the thing that is represented by an IO action in our code is the recipe itself.

That recipe is a pure value, in the same exact sense that a String is a pure value. Recipes can be combined and manipulated in interesting, sometimes astonishing, ways, but the many ways these recipes can be combined (except for the blatantly non-pure unsafePerformIO, unsafeCoerce, etc.) are all completely referentially transparent, deterministic, and all that nice stuff. The resulting recipe depends in absolutely no way whatsoever on the state of anything other than the recipes that it was built up from.
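
To make the "recipe" reading concrete, here is a minimal sketch. The names say and greeting are hypothetical, and an IORef log stands in for the console so the effects can be inspected afterwards. It is only meant to show that building and combining recipes runs nothing, and that executing the combined recipe is a separate step.

```haskell
import Control.Monad (replicateM_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical helper: a recipe that, when executed, appends a line to a log.
say :: IORef [String] -> String -> IO ()
say logRef msg = modifyIORef' logRef (++ [msg])

-- A pure function from a Bool to a recipe. Calling it performs no effect.
greeting :: IORef [String] -> Bool -> IO ()
greeting logRef formal =
  if formal then say logRef "Good evening" else say logRef "hi"

main :: IO ()
main = do
  logRef <- newIORef []
  -- An ordinary list whose elements happen to be recipes.
  let recipes = [greeting logRef True, greeting logRef False]
      -- Combining recipes yields another recipe; still nothing has run.
      twice = replicateM_ 2 (sequence_ (reverse recipes))
  before <- readIORef logRef   -- log is still empty here
  twice                        -- only now is the combined recipe executed
  after <- readIORef logRef
  print before
  print after
```

The list recipes can be reversed, filtered or duplicated like any other list; the log only changes at the line where twice is actually sequenced into main.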



Answer 2:

"Also, isn't it possible to describe pretty much any non-pure function like a function of the real world? For example, can't we think of, say, C's malloc as being a function that takes a RealWorld and an Int and returns a pointer and a RealWorld, only just like in the IO monad the RealWorld is implicit?"

For sure ...

The whole idea of functional programming is to describe programs as a combination of small, independent calculations building up bigger computations.

With these independent calculations you get lots of benefits, ranging from concise programs, to efficient and efficiently parallelizable code, to laziness, up to the rigorous guarantee that control flows as intended - with no chance of interference or corruption of arbitrary data.

Now - in some cases (like IO), we need impure code. Calculations involving such operations cannot be independent - they could mutate arbitrary data of another computation.

The point is - Haskell is always pure, IO doesn't change this.

So, our impure, non-independent code has to get a common dependency - we have to pass a RealWorld. So whatever stateful computation we want to run, we have to pass this RealWorld thing to apply our changes to - and whatever other stateful computation wants to see or make changes has to know the RealWorld too.

Whether this is done explicitly or implicitly through the IO monad is irrelevant. You build up a Haskell program as a giant computation that transforms data, and one part of this data is the RealWorld.

Once the initial main :: IO () gets called when your program is run with the current real world as a parameter, this real world gets carried through all impure calculations involved, just as data would in a State. That's what monadic >>= (bind) takes care of.

And where the RealWorld doesn't get passed (as in pure computations, or in anything that is never >>=-ed into main), there is no chance of doing anything with it. And where it does get passed, that happened by purely functional passing of an (implicit) parameter. That's why

let foo = putStrLn "AAARGH" in 42

does absolutely nothing - and why the IO monad - like anything else - is pure. What happens inside this code can of course be impure, but it's all caught inside, with no chance of interfering with non-connected computations.
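
The world-passing picture can be sketched as a toy model. Everything here (World, WorldIO, putLine, program) is made up for illustration, and the "world" is just a list of printed lines. GHC's real IO type is defined along similar lines, as a function over an opaque State# RealWorld token, but this sketch does not claim to reproduce it.

```haskell
-- Toy stand-in for the real world: just the lines printed so far.
newtype World = World [String] deriving (Eq, Show)

-- A toy "IO": a pure function that threads a World through.
newtype WorldIO a = WorldIO { runWorldIO :: World -> (a, World) }

instance Functor WorldIO where
  fmap f (WorldIO g) = WorldIO $ \w -> let (a, w') = g w in (f a, w')

instance Applicative WorldIO where
  pure a = WorldIO $ \w -> (a, w)
  WorldIO f <*> WorldIO g = WorldIO $ \w ->
    let (h, w')  = f w
        (a, w'') = g w'
    in (h a, w'')

instance Monad WorldIO where
  WorldIO g >>= k = WorldIO $ \w ->
    let (a, w') = g w          -- run the first step on the incoming world
    in runWorldIO (k a) w'     -- hand the updated world to the next step

-- Toy counterpart of putStrLn.
putLine :: String -> WorldIO ()
putLine s = WorldIO $ \(World out) -> ((), World (out ++ [s]))

program :: WorldIO Int
program = do
  putLine "hello"
  -- foo is never bound into the chain, so it never receives a World.
  pure (let foo = putLine "AAARGH" in 42)

main :: IO ()
main = do
  let (result, World out) = runWorldIO program (World [])
  print result
  print out
```

Only "hello" ends up in the final World: the action named foo is a perfectly good value, but since nothing ever applies it to a World, it cannot change one.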



Answer 3:

Suppose we have something like:

animatePowBoomWhenHearNoiseInMicrophone :: TimeDiff -> Sample -> IO ()
animatePowBoomWhenHearNoiseInMicrophone
    levelWeightedAverageHalfLife levelThreshold = ...

programA :: IO ()
programA = animatePowBoomWhenHearNoiseInMicrophone 3 10000

programB :: IO ()
programB = animatePowBoomWhenHearNoiseInMicrophone 3 10000

Here's a point of view:

animatePowBoomWhenHearNoiseInMicrophone is a pure function in the sense that its results for the same input, programA and programB, are exactly the same. You can do main = programA or main = programB and it would be exactly the same.

animatePowBoomWhenHearNoiseInMicrophone is a function receiving two arguments and resulting in a description of a program. The Haskell runtime can execute this description if you set main to it or otherwise include it in main via binding.

What is IO? IO is a DSL for describing imperative programs, encoded in "pure-haskell" data structures and functions.

"complete-haskell", aka GHC, is an implementation of both "pure-haskell" and an imperative IO decoder/executor.
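
One way to picture the "DSL plus executor" split is a tiny instruction set of one's own. The Console type and the functions greet, execute and simulate below are hypothetical, and this is not how GHC actually represents IO; it is only a sketch of a program description being plain data that different executors can interpret.

```haskell
-- A tiny, hypothetical instruction set for console programs.
data Console a
  = Done a
  | PutLine String (Console a)
  | GetLine (String -> Console a)

-- A program description: plain data built by a pure function.
greet :: String -> Console ()
greet prompt =
  PutLine prompt $
  GetLine $ \name ->
  PutLine ("Hello, " ++ name) $
  Done ()

-- Imperative executor: the only place where real effects happen.
execute :: Console a -> IO a
execute (Done a)         = pure a
execute (PutLine s next) = putStrLn s >> execute next
execute (GetLine k)      = getLine >>= execute . k

-- Pure executor: feeds canned input lines and collects the output lines.
simulate :: [String] -> Console a -> (a, [String])
simulate _      (Done a)         = (a, [])
simulate ins    (PutLine s next) = let (a, out) = simulate ins next in (a, s : out)
simulate (i:is) (GetLine k)      = simulate is (k i)
simulate []     (GetLine k)      = simulate [] (k "")  -- out of input: use ""

main :: IO ()
main = print (snd (simulate ["Ada"] (greet "Name?")))
```

The same greet "Name?" value can be handed to execute to talk to a real terminal or to simulate to be checked without any effects, which is the sense in which the description itself is pure.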



Answer 4:

It quite simply comes down to extensional equality:

If you were to use getLine twice, both uses would give you an IO String that looks exactly the same from the outside. If you were to write a function that takes two IO Strings and returns a Bool signalling a detected difference between them, it could not detect any difference from any observable property. It could not ask any other function whether they are equal, and any attempt at using >>= must also return something in IO, all of which are equal externally.



Answer 5:

I'll let Martin Odersky answer this:

"The IO monad does not make a function pure. It just makes it obvious that it's impure."

Sounds clear enough.



Answer 6:

Even though its title is a bit weird (in that it doesn't precisely match the content) the following haskell-cafe thread contains a nice discussion about different IO models for Haskell.

http://www.mail-archive.com/haskell-cafe@haskell.org/msg79613.html



Answer 7:

Well, this is what we have been taught at college -

A function is referentially transparent when it always returns the same value for a given input (or, equivalently, the same expression always evaluates to the same value in the same context). Therefore, for example, getChar would not be referentially transparent if its type were just () -> Char or Char, because you could get different results from calling it multiple times with the same argument.

But if you introduce the IO monad, then getChar can have type IO Char, and getChar itself is one fixed value of that type: the action of reading a character. So getChar always denotes the same value, no matter which key the user actually pressed.

But you are still able to "get" the underlying value from this IO Char thing. Well, not really get, but pass it to another function using the bind operator (>>=), so you can work further in your program with the Char that the user entered.
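
A small sketch of what this buys you in practice: because the action is a fixed value, giving it a name and using the name twice means the same as writing it out twice. Here nextKey is a hypothetical stand-in for getChar that pops simulated key presses from an IORef, so the example runs without a terminal; viaInline and viaShared are likewise made-up names.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- Hypothetical stand-in for getChar: pops the next simulated key press.
nextKey :: IORef String -> IO Char
nextKey keysRef = atomicModifyIORef' keysRef pop
  where
    pop (c:cs) = (cs, c)    -- (remaining keys, key returned)
    pop []     = ([], '?')  -- no keys left: return a placeholder

-- Writes the action out at both use sites.
viaInline :: IORef String -> IO (Char, Char)
viaInline ref = do
  a <- nextKey ref
  b <- nextKey ref
  pure (a, b)

-- Names the action once and uses the name twice; nothing is read at the let.
viaShared :: IORef String -> IO (Char, Char)
viaShared ref = do
  let key = nextKey ref
  a <- key
  b <- key
  pure (a, b)

main :: IO ()
main = do
  r1 <- newIORef "xy" >>= viaInline
  r2 <- newIORef "xy" >>= viaShared
  print (r1, r2)
```

Both versions read two different keys. The Chars differ between the two binds, but the action being bound is the same value each time, and only >>= (here via do-notation) exposes the Char to the rest of the program.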



Answer 8:

The inventor of Monads says: "In an impure language, an operation like tick would be represented by a function of type () -> (). The spurious argument () is required to delay the effect until the function is applied, and since the output type is () one may guess that the function's purpose lies in a side effect. In contrast, here tick has type M (): no spurious argument is needed, and the appearance of M explicitly indicates what sort of effect may occur."

I fail to understand how M () makes the empty argument list, (), less spurious, but Wadler is pretty clear that monads just indicate a kind of side effect; they do not eliminate it. Haskell followers seem to deceive us and themselves when they state that monads eliminate the impurity.