Accumulating errors with EitherT

2019-02-13 15:31发布

问题:

I have the following little mini-sample application of a web API that takes a huge JSON document and is supposed to parse it in pieces and report error messages for each of the pieces.

Following code is a working example of that using EitherT (and the errors package). However, the problem is that EitherT breaks the computation on the first Left encountered and just returns the first "error" it sees. What I would like is a list of error messages, all that are possible to produce. For instance, if the first line in runEitherT fails then there's nothing more that can be done. But if the second line fails then we can still try to run subsequent lines because they have no data dependency on the second line. So we could theoretically produce more (not necessarily all) error messages in one go.

Is it possible to run all the computations lazily and return all the error messages we can find out?

{-# LANGUAGE OverloadedStrings #-}

module Main where

import Data.ByteString.Lazy.Char8 (pack)
import Web.Scotty as S
import Network.Wai.Middleware.RequestLogger
import Data.Aeson
import Data.Aeson.Types
import Control.Lens hiding ((.=), (??))
import Data.Aeson.Lens
import qualified Data.Text as T
import Control.Error
import Control.Applicative
import qualified Data.HashMap.Strict as H
import Network.HTTP.Types

data TypeOne = TypeOne T.Text TypeTwo TypeThree
  deriving (Show)

data TypeTwo = TypeTwo Double
  deriving (Show)

data TypeThree = TypeThree Double
  deriving (Show)

main :: IO ()
main = scotty 3000 $ do
  middleware logStdoutDev

  post "/pdor" $ do
    api_key <- param "api_key"
    input   <- param "input"

    typeOne <- runEitherT $ do
      result       <- (decode (pack input) :: Maybe Value) ?? "Could not parse. Input JSON document is malformed"
      typeTwoObj   <- (result ^? key "typeTwo")            ?? "Could not find key typeTwo in JSON document."
      typeThreeObj <- (result ^? key "typeThree")          ?? "Could not find key typeThree in JSON document."
      name         <- (result ^? key "name" . _String)     ?? "Could not find key name in JSON document."
      typeTwo      <- hoistEither $ prependLeft "Error when parsing TypeTwo: " $ parseEither jsonTypeTwo typeTwoObj
      typeThree    <- hoistEither $ prependLeft "Error when parsing TypeThree: " $ parseEither jsonTypeThree typeThreeObj

      return $ TypeOne name typeTwo typeThree

    case typeOne of
      Left errorMsg -> do
        _ <- status badRequest400
        S.json $ object ["error" .= errorMsg]
      Right _ ->
        -- do something with the parsed Haskell type
        S.json $ object ["api_key" .= (api_key :: String), "message" .= ("success" :: String)]

prependLeft :: String -> Either String a -> Either String a
prependLeft msg (Left s) = Left (msg ++ s)
prependLeft _ x = x

jsonTypeTwo :: Value -> Parser TypeTwo
jsonTypeTwo (Object v) = TypeTwo <$> v .: "val"
jsonTypeTwo _ = fail $ "no data present for TypeTwo"

jsonTypeThree :: Value -> Parser TypeThree
jsonTypeThree (Object v) = TypeThree <$> v .: "val"
jsonTypeThree _ = fail $ "no data present for TypeThree"

Also open to refactoring suggestions if anyone has some.

回答1:

As I mentioned in a comment, you have at least 2 ways of accumulating error. Below I elaborate on those. We'll need these imports:

import Control.Applicative
import Data.Monoid
import Data.These

TheseT monad transformer

Disclaimer: TheseT is called ChronicleT in these package.

Take a look at the definition of These data type:

data These a b = This a | That b | These a b

Here This and That correspond to Left and Right of Either data type. These data constructor is what enables accumulating capability for Monad instance: it contains both result (of type b) and a collection of previous errors (collection of type a).

Taking advantage of already existing definition of These data type we can easily create ErrorT-like monad transformer:

newtype TheseT e m a = TheseT {
  runTheseT :: m (These e a)
}

TheseT is an instance of Monad in the following way:

instance Functor m => Functor (TheseT e m) where
  fmap f (TheseT m) = TheseT (fmap (fmap f) m)

instance (Monoid e, Applicative m) => Applicative (TheseT e m) where
  pure x = TheseT (pure (pure x))
  TheseT f <*> TheseT x = TheseT (liftA2 (<*>) f x)

instance (Monoid e, Monad m) => Monad (TheseT e m) where
  return x = TheseT (return (return x))
  m >>= f = TheseT $ do
    t <- runTheseT m
    case t of
      This  e   -> return (This e)
      That    x -> runTheseT (f x)
      These _ x -> do
        t' <- runTheseT (f x)
        return (t >> t')  -- this is where errors get concatenated

Applicative accumulating ErrorT

Disclaimer: this approach is somewhat easier to adapt since you already work in m (Either e a) newtype wrapper, but it works only in Applicative setting.

If the actual code only uses Applicative interface we can get away with ErrorT changing its Applicative instance.

Let's start with a non-transformer version:

data Accum e a = ALeft e | ARight a

instance Functor (Accum e) where
  fmap f (ARight x) = ARight (f x)
  fmap _ (ALeft e)  = ALeft e

instance Monoid e => Applicative (Accum e) where
  pure = ARight
  ARight f <*> ARight x = ARight (f x)
  ALeft e  <*> ALeft e' = ALeft (e <> e')
  ALeft e  <*> _        = ALeft e
  _        <*> ALeft e  = ALeft e

Note that when defining <*> we know if both sides are ALefts and thus can perform <>. If we try to define corresponding Monad instance we fail:

instance Monoid e => Monad (Accum e) where
  return = ARight
  ALeft e >>= f = -- we can't apply f

So the only Monad instance we might have is that of Either. But then ap is not the same as <*>:

Left a <*>  Left b  ≡  Left (a <> b)
Left a `ap` Left b  ≡  Left a

So we only can use Accum as Applicative.

Now we can define Applicative transformer based on Accum:

newtype AccErrorT e m a = AccErrorT {
  runAccErrorT :: m (Accum e a)
}

instance (Functor m) => Functor (AccErrorT e m) where
  fmap f (AccErrorT m) = AccErrorT (fmap (fmap f) m)

instance (Monoid e, Applicative m) => Applicative (AccErrorT e m) where
  pure x = AccErrorT (pure (pure x))
  AccErrorT f <*> AccErrorT x = AccErrorT (liftA2 (<*>) f x)

Note that AccErrorT e m is essentially Compose m (Accum e).


EDIT:

AccError is known as AccValidation in validation package.



回答2:

We could actually code this as an arrow (Kleisli transformer).

newtype EitherAT x m a b = EitherAT { runEitherAT :: a -> m (Either x b) }

instance Monad m => Category EitherAT x m where
  id = EitherAT $ return . Right
  EitherAT a . EitherAT b
       = EitherAT $ \x -> do
              ax <- a x
              case ax of Right y -> b y
                         Left e  -> return $ Left e

instance (Monad m, Semigroup x) => Arrow EitherAT x m where
  arr f = EitherAT $ return . Right . f
  EitherAT a *** EitherAT b = EitherAT $ \(x,y) -> do
      ax <- a x
      by <- b y
      return $ case (ax,by) of
        (Right x',Right y') -> Right (x',y')
        (Left e  , Left f ) -> Left $ e <> f
        (Left e  , _      ) -> Left e
        (  _     , Left f ) ->        Left f
  first = (***id)

Only, that would violate the arrow laws (you can't rewrite a *** b to first a >>> second b without losing a's error information). But if you basically see all the Lefts as merely a debugging device, you might argue it's okay.