Coming from the OCaml community, I'm trying to learn a bit of Haskell. The transition is going quite well, but I'm a bit confused about debugging. I used to put (lots of) "printf" calls in my OCaml code, to inspect intermediate values or as flags to see exactly where the computation failed.
Since printf is an IO action, do I have to lift all my Haskell code into the IO monad to be able to do this kind of debugging? Or is there a better way to do this? (I really don't want to do it by hand if it can be avoided.)
I also found the trace function: http://www.haskell.org/haskellwiki/Debugging#Printf_and_friends which seems to be exactly what I want, but I don't understand its type: there is no IO anywhere! Can someone explain the behaviour of the trace function to me?
`trace` also tends to over-evaluate its argument for printing, losing a lot of the benefits of laziness in the process.
Well, since the whole of Haskell is built around the principle of lazy evaluation (so the order in which things are computed is not fixed by the order of the source code), using printf-style statements makes very little sense in it.
If using the REPL and inspecting the resulting values is really not enough for your debugging, wrapping everything into IO is the only choice (but it's not THE RIGHT WAY of Haskell programming).
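The point about lazy evaluation can be seen in a small experiment. This is only a sketch: `noisy` and `firstOnly` are hypothetical names made up for the illustration, and `trace` from `Debug.Trace` writes its message to stderr when (and only when) the value it wraps is actually demanded.

```haskell
import Debug.Trace (trace)

-- A value whose evaluation announces itself on stderr.
-- (Hypothetical helper, introduced only for this illustration.)
noisy :: String -> Int -> Int
noisy label n = trace ("evaluating " ++ label) n

-- Only the first component of the pair is ever demanded, so the
-- second thunk is never forced and its message should never appear.
firstOnly :: Int
firstOnly = fst (noisy "used" 1, noisy "unused" 2)

main :: IO ()
main = print firstOnly
```

A print statement placed "in" a pure expression therefore tells you when that expression was demanded, not where it sits in the source.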
If you can wait until the program is finished before studying the output, then stacking a Writer monad is the classic approach to implementing a logger. I use this here to return a result set from impure HDBC code.
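A minimal sketch of the `Writer` approach, assuming the `mtl` package (which provides `Control.Monad.Writer`) is available. The function `factLog` is a hypothetical example and assumes a non-negative argument; it is not taken from the HDBC code mentioned above.

```haskell
import Control.Monad.Writer (Writer, runWriter, tell)

-- Hypothetical example: a factorial that records each intermediate
-- result in a log. Assumes n >= 0.
factLog :: Integer -> Writer [String] Integer
factLog 0 = do
  tell ["fact 0 = 1"]
  return 1
factLog n = do
  rest <- factLog (n - 1)
  let result = n * rest
  tell ["fact " ++ show n ++ " = " ++ show result]
  return result

main :: IO ()
main = do
  -- The log only becomes available once the computation has been run.
  let (value, logLines) = runWriter (factLog 3)
  mapM_ putStrLn logLines
  print value
```

Nothing is printed while `factLog` runs; the log is an ordinary value returned next to the result, which is why this fits the "wait until the program is finished" case.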
`trace` is the easiest method to use for debugging. It's not in `IO` exactly for the reason you pointed out: there is no need to lift your code into the `IO` monad. It is implemented on top of `unsafePerformIO`, so there is IO behind the scenes, but `unsafePerformIO` is used to escape out of it. That's a function which potentially breaks referential transparency, which you can guess from its type, `IO a -> a`, and also from its name.
`trace` is simply made impure. The point of the `IO` monad is to preserve purity (no IO goes unnoticed by the type system) and to define the order in which statements execute, which would otherwise be practically undefined under lazy evaluation. At your own risk, however, you can still hack together some `IO a -> a`, i.e. perform impure IO. This is a hack and of course "suffers" from lazy evaluation, but that is simply what `trace` does for the sake of debugging.
Still, you should probably look at other approaches to debugging:
Reducing the need for debugging intermediate values
Use breakpoints etc. (compiler-based debugging)
Use generic monads. If your code is monadic anyway, write it independently of any concrete monad. Use `type M a = ...` instead of plain `IO ...`. You can then easily combine monads through transformers and put a debugging monad on top. Even if the need for monads is gone, you could just insert `Identity a` for pure values.
For what it's worth, there are actually two kinds of "debugging" at issue here: keeping a log of intermediate values, and inspecting the run-time behavior of code.
In a strict imperative language these usually coincide. In Haskell, they often do not:
If you just want to keep a log of intermediate values, there are many ways to do so. For instance, rather than lifting everything into `IO`, a simple `Writer` monad will suffice, this being equivalent to making functions return a 2-tuple of their actual result and an accumulator value (some sort of list, typically).
It's also not usually necessary to put everything into the monad, only the functions that need to write to the "log" value. For instance, you can factor out just the subexpressions that might need to do logging, leaving the main logic pure, then reassemble the overall computation by combining pure functions and logging computations in the usual manner with `fmap`s and whatnot. Keep in mind that `Writer` is kind of a sorry excuse for a monad: with no way to read from the log, only write to it, each computation is logically independent of its context, which makes it easier to juggle things around.
But in some cases even that's overkill. For many pure functions, just moving subexpressions to the top level and trying things out in the REPL works pretty well.
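The "2-tuple of result and accumulator" equivalence can be spelled out by hand, using nothing beyond the Prelude. This is a sketch under assumptions: `Logged`, `halveLogged`, `describe`, `andThen`, `mapLogged` and `pipeline` are all hypothetical names. `mapLogged` plays the role that `fmap` plays for `Writer`, and `andThen` the role of `>>=`.

```haskell
-- A hand-rolled stand-in for Writer: a result paired with a log.
type Logged a = (a, [String])

-- A logging computation: halves a number and records what it saw.
halveLogged :: Int -> Logged Int
halveLogged n = (n `div` 2, ["halving " ++ show n])

-- The main logic stays pure and knows nothing about logging.
describe :: Int -> String
describe n = "result is " ++ show n

-- Sequence two logging steps by hand, concatenating their logs.
andThen :: Logged a -> (a -> Logged b) -> Logged b
andThen (x, log1) f =
  let (y, log2) = f x
  in (y, log1 ++ log2)

-- Apply a pure function to the result, leaving the log untouched.
mapLogged :: (a -> b) -> Logged a -> Logged b
mapLogged f (x, logLines) = (f x, logLines)

-- Two logging steps followed by a pure step.
pipeline :: Int -> Logged String
pipeline n = mapLogged describe (halveLogged n `andThen` halveLogged)

main :: IO ()
main = do
  let (out, logLines) = pipeline 20
  mapM_ putStrLn logLines
  putStrLn out
```

Only `halveLogged` knows about the log; `describe` is an ordinary pure function that gets mapped over the result afterwards.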
If you want to actually inspect the run-time behavior of pure code, however (for instance, to figure out why a subexpression diverges), there is in general no way to do so from other pure code. In fact, this is essentially the definition of purity. So in that case, you have no choice but to use tools that exist "outside" the pure language: either impure functions such as `unsafePerformPrintfDebugging` (errr, I mean `trace`) or a modified runtime environment, such as the GHCi debugger.
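To make the `trace` option concrete, here is a minimal sketch. `myTrace` is a hypothetical re-implementation written only to show how a function of type `String -> a -> a` can be built on `unsafePerformIO`; it is not the library's actual source, and real code should simply import `Debug.Trace`. `collatzSteps` is likewise a made-up example that assumes a positive argument.

```haskell
import Debug.Trace (trace)
import System.IO (hPutStrLn, stderr)
import System.IO.Unsafe (unsafePerformIO)

-- Illustration only: print a message to stderr, then return the
-- value unchanged. The IO is hidden from the type by unsafePerformIO.
myTrace :: String -> a -> a
myTrace msg x = unsafePerformIO $ do
  hPutStrLn stderr msg
  return x
{-# NOINLINE myTrace #-}

-- A pure function instrumented with the real trace. The first guard
-- always fails, but evaluating it emits the message as a side effect.
-- Assumes n >= 1.
collatzSteps :: Int -> Int
collatzSteps 1 = 0
collatzSteps n
  | trace ("collatzSteps " ++ show n) False = undefined
  | even n = 1 + collatzSteps (n `div` 2)
  | otherwise = 1 + collatzSteps (3 * n + 1)

main :: IO ()
main = do
  print (collatzSteps 6)
  print (myTrace "about to return 42" (42 :: Int))
```

The trace messages go to stderr and appear whenever the corresponding call is actually evaluated, so their interleaving with normal output depends on evaluation order rather than source order.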