What approach to error handling to use with pipes?

Posted 2019-01-28 05:42

Question:

I'm currently writing some pipes-core/attoparsec plumbing for a small project of mine. I want each parser to give a pipe that awaits ByteString input to the parser and yields any parsed values (restarting the parser). Without error handling it would thus have a type like

parserP :: Monad m => Parser a -> Pipe ByteString a m r

Now, I'm unsure about what to do with parse errors. My current ideas are to:

  • add the errors to the return type (i.e. return a value in Either ParseError r rather than just r)
  • require the monad to provide the error handling mechanism (i.e. require the monad the pipe is taken over to implement MonadError)
  • force the monad to supply an error mechanism by taking the Pipe over ErrorT e m a for any monad m
  • add parameters, letting the user specify the behaviour (something like (ParseError -> P.Pipe ByteString a m r), and simply bind to the thus provided pipe in case of a parse error)
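To make the comparison concrete, the four options above can be written down as type signatures. This is only a sketch: `Pipe`, `Parser` and `ParseError` below are placeholder declarations (assumptions, so that the block type-checks on its own) standing in for the real pipes-core and attoparsec types, the function names are hypothetical, and `ExceptT` from transformers is used as a stand-in for `ErrorT`.

```haskell
{-# LANGUAGE FlexibleContexts #-}
module Main where

import Control.Monad.Except (MonadError)     -- from mtl
import Control.Monad.Trans.Except (ExceptT)  -- stand-in for ErrorT
import Data.ByteString (ByteString)

-- Placeholder types (assumptions): the real ones would come from
-- pipes-core and attoparsec.
newtype Pipe a b m r = Pipe (m r)
data Parser a = Parser
newtype ParseError = ParseError String

-- 1. Report the error in the pipe's return value.
parserEither :: Monad m => Parser a -> Pipe ByteString a m (Either ParseError r)
parserEither = undefined

-- 2. Require the base monad to provide the error mechanism.
parserMonadError :: MonadError ParseError m => Parser a -> Pipe ByteString a m r
parserMonadError = undefined

-- 3. Fix the base monad to an error transformer over any m.
parserExceptT :: Monad m => Parser a -> Pipe ByteString a (ExceptT ParseError m) r
parserExceptT = undefined

-- 4. Let the caller supply a pipe to continue with after a parse error.
parserHandler :: Monad m
              => (ParseError -> Pipe ByteString a m r)
              -> Parser a
              -> Pipe ByteString a m r
parserHandler = undefined

-- The bodies are deliberately left undefined; only the types matter here.
main :: IO ()
main = putStrLn "signatures type-check"
```

Seen side by side, options 1 and 4 keep the base monad unconstrained, while options 2 and 3 push the error into the monad the whole pipeline runs in.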

The first solution seems wrong, as using the return type of a pipe for error handling seems like rather a hack. For one thing, it makes composition with the pipe uglier, and it seems to be more or less subsumed by the final solution (apart from possibly losing the ability to let a downstream pipe recover from an error by using tryAwait and then stop awaiting values?).

The second solution seems wrong, though I can't quite put my finger on why. Possibly because it would (could?) also require taking a parameter that translates the ParseError into whatever error type the monad has (unless we wish to require the monad to implement MonadError ParseError, which seems like it would result in a lot of bookkeeping). Finally, I can't remember seeing MonadError around all that much, which would suggest that there is some issue with using it.

The third solution would work in my case, as the pipe will be part of a pipeline with a user-specified monad (IO) that is not supposed to care about the parsing errors (it'll parse network data into a format yielded to a user-specified type). But it doesn't seem all that elegant, and would, again, (possibly?) result in a lot of bookkeeping once used in any other context.

I haven't really thought through the final solution, but it seems somewhat convoluted.

I would be grateful for any thoughts on this particular case (I wouldn't be at all surprised if I'm way off and missing something obvious), and for any (more or less relevant) references to discussions on error handling in pipes(-core)/conduits/iteratee, etc.

EDIT: Another possibility might be to take just a monadic action (rather than a full-blown pipe), though I'm not quite sure whether it would generalise, specialise or even be equivalent to the fourth one.

Answer 1:

If I may, I think I can help organize everybody's thoughts on this by describing the choice like so. You either:

  1. Layer Pipe within an EitherT/ErrorT:

    EitherT e (Pipe a b m) r

  2. Layer Pipe outside of an EitherT/ErrorT:

    Pipe a b (EitherT e m) r

You want the former approach, which also has the nice property that you can make it an instance of MonadError (if that is your thing).

The difference between the two approaches is this: the second one throws errors at the level of the entire pipeline, whereas the first one permits error handling at the granularity of individual pipes and correctly handles composed pipes.

Now for some code. I'll use EitherT if you don't mind, since I'm more comfortable with it:

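import Control.Error  -- from the errors package
import Control.Pipe

-- A pipe whose computation may fail with an error of type e.
type PipeE e a b m r = EitherT e (Pipe a b m) r

-- Run a complete pipeline, exposing the error (if any) as an Either.
runPipeE p = runPipe (runEitherT p)

-- Compose two failable pipes; an error in either one ends the composite.
(<?<) :: Monad m => PipeE e b c m r -> PipeE e a b m r -> PipeE e a c m r
p1 <?< p2 = EitherT (runEitherT p1 <+< runEitherT p2)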

Then just use catchT and throwT within a PipeE to your heart's content.
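As a rough illustration of throwing and catching inside a single stage, here is a self-contained sketch. It makes several assumptions that are not part of the original code: it defines its own toy `Pipe` type and `<+<` rather than importing `Control.Pipe`, it uses `ExceptT`, `throwE` and `catchE` from transformers in place of `EitherT`, `throwT` and `catchT`, and the stage names (`source`, `parseInts`, `parseIntsLenient`) and the `runCollect` helper are made up for the example.

```haskell
module Main where

import Control.Monad (ap, forever, liftM)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT (..), catchE, runExceptT, throwE)
import Data.Functor.Identity (Identity, runIdentity)

-- Toy stand-in for the library's Pipe type (an assumption, not the real API).
data Pipe a b m r
  = Pure r
  | Await (a -> Pipe a b m r)
  | Yield b (Pipe a b m r)
  | Effect (m (Pipe a b m r))

instance Monad m => Functor (Pipe a b m) where
  fmap = liftM

instance Monad m => Applicative (Pipe a b m) where
  pure  = Pure
  (<*>) = ap

instance Monad m => Monad (Pipe a b m) where
  Pure r    >>= k = k r
  Await f   >>= k = Await (\x -> f x >>= k)
  Yield b p >>= k = Yield b (p >>= k)
  Effect m  >>= k = Effect (liftM (>>= k) m)

await :: Pipe a b m a
await = Await Pure

yield :: b -> Pipe a b m ()
yield b = Yield b (Pure ())

-- downstream <+< upstream; the composite stops when either side returns.
(<+<) :: Monad m => Pipe b c m r -> Pipe a b m r -> Pipe a c m r
Pure r     <+< _          = Pure r
Yield c p1 <+< p2         = Yield c (p1 <+< p2)
Effect m   <+< p2         = Effect (liftM (<+< p2) m)
Await f    <+< Yield b p2 = f b <+< p2
Await _    <+< Pure r     = Pure r
p1         <+< Await g    = Await (\a -> p1 <+< g a)
p1         <+< Effect m   = Effect (liftM (p1 <+<) m)

-- Run a pipe that needs no real input, collecting everything it yields.
runCollect :: Monad m => Pipe () b m r -> m ([b], r)
runCollect (Pure r)    = return ([], r)
runCollect (Await f)   = runCollect (f ())
runCollect (Yield b p) = do
  (bs, r) <- runCollect p
  return (b : bs, r)
runCollect (Effect m)  = m >>= runCollect

-- The error layer sits outside the pipe, as in the PipeE synonym.
type PipeE e a b m r = ExceptT e (Pipe a b m) r

(<?<) :: Monad m => PipeE e b c m r -> PipeE e a b m r -> PipeE e a c m r
p1 <?< p2 = ExceptT (runExceptT p1 <+< runExceptT p2)

-- Upstream stage: emit each chunk, then finish normally.
source :: Monad m => [String] -> PipeE String () String m ()
source = mapM_ (lift . yield)

-- Parsing stage: throws on the first chunk that is not an Int.
parseInts :: Monad m => PipeE String String Int m ()
parseInts = forever $ do
  s <- lift await
  case reads s of
    [(n, "")] -> lift (yield n)
    _         -> throwE ("cannot parse: " ++ show s)

-- Same stage, but the error is caught inside this pipe: it yields a
-- placeholder 0 and resumes, so the composite never sees the failure.
parseIntsLenient :: Monad m => PipeE String String Int m ()
parseIntsLenient = parseInts `catchE` \_ -> lift (yield 0) >> parseIntsLenient

run :: PipeE String () Int Identity () -> ([Int], Either String ())
run = runIdentity . runCollect . runExceptT

strict, lenient :: ([Int], Either String ())
strict  = run (parseInts        <?< source ["1", "2", "x", "4"])
lenient = run (parseIntsLenient <?< source ["1", "2", "x", "4"])

main :: IO ()
main = do
  print strict   -- ([1,2],Left "cannot parse: \"x\"")
  print lenient  -- ([1,2,0,4],Right ())
```

In the strict run the error ends the composite after the values already yielded; in the lenient run the handler lives inside one stage, which is the per-pipe granularity this layering is meant to give.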

This approach has another advantage, which is that you can selectively apply it to certain segments of the pipeline, but then you are responsible for dealing with the potential exceptional value before composing it with other pipes. You could use that flexibility to use exceptional values of different types for different stages of the pipeline or to not use it at all for stages that can't fail and avoid the overhead of error-checking those stages.