How to “debug” Haskell with printfs?

Posted 2019-01-16 22:39

Question:

Coming from the OCaml community, I'm trying to learn a bit of Haskell. The transition is going quite well, but I'm a bit confused about debugging. I used to put (lots of) "printf" calls in my OCaml code, to inspect some intermediate values, or as flags to see exactly where the computation failed.

Since printf is an IO action, do I have to lift all my Haskell code into the IO monad to be able to do this kind of debugging? Or is there a better way to do this (I really don't want to do it by hand if it can be avoided)?

I also found the trace function: http://www.haskell.org/haskellwiki/Debugging#Printf_and_friends which seems to be exactly what I want, but I don't understand its type: there is no IO anywhere! Can someone explain the behaviour of the trace function to me?

Answer 1:

trace is the easiest method to use for debugging. It is not in IO for exactly the reason you pointed out: so that you don't need to lift your code into the IO monad. It is implemented roughly like this:

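import Debug.Trace (traceIO)
import System.IO.Unsafe (unsafePerformIO)

-- Simplified; traceIO was named putTraceMsg in older versions of base.
trace :: String -> a -> a
trace string expr = unsafePerformIO $ do
    traceIO string
    return expr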

So there is IO behind the scenes, but unsafePerformIO is used to escape from it. That function can break referential transparency, which you can guess from its type, IO a -> a, and also from its name.
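As a rough illustration of how this looks from the caller's side, here is a minimal sketch that uses the real Debug.Trace.trace inside an otherwise pure function. The factorial function is only a made-up example, not something from the question.

```haskell
import Debug.Trace (trace)

-- Hypothetical example function: a pure factorial whose type has no IO in it.
-- Each call emits a message on stderr as a side effect of being evaluated.
factorial :: Integer -> Integer
factorial n =
  trace ("factorial called with " ++ show n) $
    if n <= 1
      then 1
      else n * factorial (n - 1)
```

Evaluating factorial 3 in GHCi should print one message per call and then the result. The messages only appear when the traced expression is actually demanded, so their position in the output follows evaluation order, not source order.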



Answer 2:

trace is simply made impure. The point of the IO monad is to preserve purity (no IO goes unnoticed by the type system) and to define the order in which statements execute, which lazy evaluation would otherwise leave practically undefined.

At your own risk, however, you can still hack together a function of type IO a -> a, i.e. perform impure IO. This is a hack and of course "suffers" from lazy evaluation, but it is exactly what trace does for the sake of debugging.

Still, you should probably prefer other approaches to debugging:

  1. Reducing the need for debugging intermediate values

    • Write small, reusable, clear, generic functions whose correctness is obvious.
    • Combine the correct pieces into larger correct pieces.
    • Write tests or try out pieces interactively.
  2. Use breakpoints etc. (compiler-based debugging)

  3. Use generic monads. If your code is monadic anyway, write it independently of any concrete monad. Use type M a = ... instead of plain IO .... Afterwards you can easily combine monads through transformers and put a debugging monad on top. Even if the need for monads is gone, you can just insert Identity a for pure values.
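One possible reading of point 3, sketched under some assumptions: the function below is polymorphic in its monad and takes the logging action as a parameter, so the same code can run in IO while debugging and in Identity otherwise. The names sumLogged, sumDebug and sumPure are invented for this illustration, and passing a logger explicitly is just one simple way to stay monad-agnostic (a transformer stack would be another).

```haskell
import Data.Functor.Identity (runIdentity)

-- Sums a list, reporting each running total through a caller-supplied
-- logging action. Nothing here commits to a concrete monad.
sumLogged :: Monad m => (String -> m ()) -> [Int] -> m Int
sumLogged logMsg = go 0
  where
    go acc [] = return acc
    go acc (x : xs) = do
      let acc' = acc + x
      logMsg ("running total: " ++ show acc')
      go acc' xs

-- Debugging variant: messages go to stdout, so the result lives in IO.
sumDebug :: [Int] -> IO Int
sumDebug = sumLogged putStrLn

-- Quiet variant: messages are discarded and the result is a plain value.
sumPure :: [Int] -> Int
sumPure = runIdentity . sumLogged (\_ -> return ())
```

Switching between the two variants does not require touching sumLogged itself, which is the main attraction of writing against a generic monad.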



Answer 3:

For what it's worth, there are actually two kinds of "debugging" at issue here:

  • Logging intermediate values, such as the value a particular subexpression has on each call into a recursive function
  • Inspecting the runtime behavior of the evaluation of an expression

In a strict imperative language these usually coincide. In Haskell, they often do not:

  • Recording intermediate values can change the runtime behavior, such as by forcing the evaluation of terms that would otherwise be discarded.
  • The actual process of computation can dramatically differ from the apparent structure of an expression due to laziness and shared subexpressions.

If you just want to keep a log of intermediate values, there are many ways to do so. For instance, rather than lifting everything into IO, a simple Writer monad will suffice; this is equivalent to making functions return a 2-tuple of their actual result and an accumulator value (typically some sort of list).

It's also not usually necessary to put everything into the monad, only the functions that need to write to the "log" value--for instance, you can factor out just the subexpressions that might need to do logging, leaving the main logic pure, then reassemble the overall computation by combining pure functions and logging computations in the usual manner with fmaps and whatnot. Keep in mind that Writer is kind of a sorry excuse for a monad: with no way to read from the log, only write to it, each computation is logically independent of its context, which makes it easier to juggle things around.
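The Writer approach described above could look something like the following sketch. It assumes the transformers package (which ships with GHC) and uses an invented Collatz example; nextValue stays pure and only collatzSteps, which needs the log, lives in the monad.

```haskell
import Control.Monad.Trans.Writer (Writer, runWriter, tell)

-- Pure helper with no logging, so it stays outside the monad.
nextValue :: Int -> Int
nextValue n
  | even n = n `div` 2
  | otherwise = 3 * n + 1

-- Counts the Collatz steps needed to reach 1, logging each value visited.
-- Assumes n >= 1; other inputs would not terminate.
collatzSteps :: Int -> Writer [String] Int
collatzSteps 1 = return 0
collatzSteps n = do
  tell ["visiting " ++ show n]
  steps <- collatzSteps (nextValue n)
  return (steps + 1)
```

Calling runWriter (collatzSteps 4) yields the 2-tuple mentioned above: the step count together with the accumulated list of log lines. A list is used as the accumulator here for simplicity; for long logs a structure with cheaper appends may be preferable.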

But in some cases even that's overkill--for many pure functions, just moving subexpressions to the toplevel and trying things out in the REPL works pretty well.

If you want to actually inspect run-time behavior of pure code, however--for instance, to figure out why a subexpression diverges--there is in general no way to do so from other pure code--in fact, this is essentially the definition of purity. So in that case, you have no choice but to use tools that exist "outside" the pure language: either impure functions such as unsafePerformPrintfDebugging--errr, I mean trace--or a modified runtime environment, such as the GHCi debugger.



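Answer 4:

trace also has to fully evaluate whatever you put into its message in order to print it, which tends to over-evaluate values that would otherwise stay unevaluated and loses a lot of the benefits of laziness in the process.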
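A small sketch of that effect, using invented function names: both functions below return the first component of a pair, but the traced one shows the whole pair in its message, so it also forces the second component. The throwsWhenForced helper is only there to make the difference observable.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Debug.Trace (trace)

-- Never looks at the second component, so it may safely be undefined.
firstQuiet :: (Int, Int) -> Int
firstQuiet (a, _) = a

-- Same result, but building the message calls show on the whole pair,
-- which forces both components when the message is printed.
firstTraced :: (Int, Int) -> Int
firstTraced p@(a, _) = trace ("pair = " ++ show p) a

-- Helper for experimenting: True if evaluating the value throws.
throwsWhenForced :: Int -> IO Bool
throwsWhenForced x = do
  r <- try (evaluate x) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)
```

With the argument (1, undefined), firstQuiet returns 1 while firstTraced throws, because the debugging message demanded a value the original code never needed. Keeping trace messages limited to values that are going to be evaluated anyway avoids this.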



Answer 5:

If you can wait until the program is finished before studying the output, then stacking a Writer monad is the classic approach to implementing a logger. I use this here to return a result set from impure HDBC code.
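As a hedged sketch of what stacking a Writer on top of impure code might look like, the example below layers WriterT over IO using the transformers package. It is not the answerer's HDBC code: fetchRow is a hypothetical stand-in for a database call, and fetchAll is an invented name.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Writer (WriterT, runWriterT, tell)

-- Hypothetical stand-in for an impure lookup such as a database query.
fetchRow :: Int -> IO String
fetchRow n = return ("row-" ++ show n)

-- Fetches every id, recording a log line before and after each lookup.
fetchAll :: [Int] -> WriterT [String] IO [String]
fetchAll = mapM step
  where
    step n = do
      tell ["fetching id " ++ show n]
      row <- lift (fetchRow n)
      tell ["got " ++ row]
      return row
```

Running runWriterT (fetchAll [1, 2]) in IO returns the rows together with the collected log, which is only available once the whole computation has finished, as the answer notes.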



Answer 6:

Well, since the whole of Haskell is built around the principle of lazy evaluation (so the order of evaluation is driven by demand and is hard to predict from the source), using printfs makes very little sense in it.

If inspecting the resulting values in the REPL is really not enough for your debugging, wrapping everything into IO is the only choice (but it's not THE RIGHT WAY of Haskell programming).