Handling exceptions in an iteratee library without an error state

Published 2019-07-03 22:07

我命由我不由天

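I'm trying to write an enumerator for reading files line by line from a java.io.BufferedReader using Scalaz 7's iteratee library, which currently only provides an (extremely slow) enumerator for java.io.Reader.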

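The problems I'm running into are related to the fact that all of the other iteratee libraries I've used (e.g. Play 2.0's and John Millikin's enumerator for Haskell) have an error state as one of their Step type's constructors, and Scalaz 7 doesn't.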

My current implementation

Here's what I currently have. First, some imports and IO wrappers:

import java.io.{ BufferedReader, File, FileReader }
import scalaz._, Scalaz._, effect.IO, iteratee.{ Iteratee => I, _ }

def openFile(f: File) = IO(new BufferedReader(new FileReader(f)))
def readLine(r: BufferedReader) = IO(Option(r.readLine))
def closeReader(r: BufferedReader) = IO(r.close())

And a type alias to clean things up a bit:

type ErrorOr[A] = Either[Throwable, A]

And now a tryIO helper, modeled (loosely, and probably wrongly) on the one in enumerator:

def tryIO[A, B](action: IO[B]) = I.iterateeT[A, IO, ErrorOr[B]](
  action.catchLeft.map(
    r => I.sdone(r, r.fold(_ => I.eofInput, _ => I.emptyInput))
  )
)
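
The key step in tryIO is catchLeft, which turns an exception thrown while the action runs into a Left value. A minimal sketch of that idea using only the Scala standard library (assuming Scala 2.12 or later); Thunk and attempt are hypothetical stand-ins for illustration, not Scalaz's IO or its catchLeft, and Try only captures non-fatal exceptions:

```scala
import scala.util.Try

object CatchLeftSketch {
  // Hypothetical stand-in for an effect type such as IO: a deferred computation.
  final class Thunk[A](body: () => A) {
    // Run the computation; any exception propagates to the caller.
    def unsafeRun(): A = body()

    // Similar in spirit to catchLeft: a (non-fatal) exception thrown when the
    // computation runs is returned as a Left instead of propagating.
    def attempt: Thunk[Either[Throwable, A]] =
      new Thunk(() => Try(body()).toEither)
  }

  object Thunk {
    // The by-name parameter keeps the body unevaluated until unsafeRun().
    def apply[A](a: => A): Thunk[A] = new Thunk(() => a)
  }

  val ok: Either[Throwable, Int]  = Thunk("42".toInt).attempt.unsafeRun()
  val bad: Either[Throwable, Int] = Thunk("x".toInt).attempt.unsafeRun()
}
```

Here ok holds the parsed number in a Right, while bad holds the NumberFormatException in a Left rather than throwing it.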

And an enumerator for the BufferedReader itself:

def enumBuffered(r: => BufferedReader) = new EnumeratorT[ErrorOr[String], IO] {
  lazy val reader = r
  def apply[A] = (s: StepT[ErrorOr[String], IO, A]) => s.mapCont(k =>
    tryIO(readLine(reader)) flatMap {
      case Right(None)       => s.pointI
      case Right(Some(line)) => k(I.elInput(Right(line))) >>== apply[A]
      case Left(e)           => k(I.elInput(Left(e)))
    }
  )
}

And finally an enumerator that's responsible for opening and closing the reader:

def enumFile(f: File) = new EnumeratorT[ErrorOr[String], IO] {
  def apply[A] = (s: StepT[ErrorOr[String], IO, A]) => s.mapCont(k =>
    tryIO(openFile(f)) flatMap {
      case Right(reader) => I.iterateeT(
        enumBuffered(reader).apply(s).value.ensuring(closeReader(reader))
      )
      case Left(e) => k(I.elInput(Left(e)))
    }
  )
}

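Now suppose, for example, that I want to collect all the lines in a file that contain at least twenty-five '0' characters into a list. I can write: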

val action: IO[ErrorOr[List[String]]] = (
  I.consume[ErrorOr[String], IO, List] %=
  I.filter(_.fold(_ => true, _.count(_ == '0') >= 25)) &=
  enumFile(new File("big.txt"))
).run.map(_.sequence)

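In many ways this seems to work beautifully: I can kick the action off with unsafePerformIO and it will chunk through tens of millions of lines and gigabytes of data in a couple of minutes, in constant memory and without blowing the stack, and then close the reader when it's done. If I give it the name of a file that doesn't exist, it will dutifully give me back the exception wrapped in a Left, and enumBuffered at least seems to behave appropriately if it hits an exception while reading.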
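
For comparison, the close-when-done guarantee that ensuring provides in enumFile can be pictured as a plain try/finally around the read loop. The following is only a sketch with the standard library (assuming Scala 2.12 or later), not how Scalaz implements it; TrackedReader and linesWithZeros are hypothetical names introduced for illustration:

```scala
import java.io.{ BufferedReader, StringReader }
import scala.util.Try

object EnsureCloseSketch {
  // Hypothetical reader that records whether close() was called,
  // so the cleanup behaviour can be observed.
  final class TrackedReader(text: String)
      extends BufferedReader(new StringReader(text)) {
    var closed = false
    override def close(): Unit = { closed = true; super.close() }
  }

  // Collect the lines containing at least `min` '0' characters.
  // The reader is closed whether reading succeeds or throws, and any
  // non-fatal exception is returned as a Left.
  def linesWithZeros(reader: BufferedReader, min: Int): Either[Throwable, List[String]] =
    Try {
      try {
        Iterator
          .continually(reader.readLine())
          .takeWhile(_ != null)
          .filter(line => line.count(_ == '0') >= min)
          .toList
      } finally reader.close()
    }.toEither
}
```

Calling linesWithZeros(new TrackedReader("000\n1\n0000"), 3) keeps the first and third lines and leaves the reader closed afterwards.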

Potential problems

I have some concerns about my implementation, though, particularly of tryIO. For example, suppose I try to compose a few iteratees:

val it = for {
  _ <- tryIO[Unit, Unit](IO(println("a")))
  _ <- tryIO[Unit, Unit](IO(throw new Exception("!")))
  r <- tryIO[Unit, Unit](IO(println("b")))
} yield r

If I run this, I get the following:

scala> it.run.unsafePerformIO()
a
b
res11: ErrorOr[Unit] = Right(())

If I try the same thing with enumerator in GHCi, the result is more like what I'd expect:

...> run $ tryIO (putStrLn "a") >> tryIO (error "!") >> tryIO (putStrLn "b")
a
Left !

I just don't see a way to get this behavior without an error state in the iteratee library itself.
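
The difference between the two sessions can be reproduced without any iteratee library. In the sketch below (standard library only, assuming Scala 2.12 or later, with hypothetical helper names), the first function treats each Either as an ordinary result value, so every step runs and only the last result survives, loosely mirroring the Scalaz session. The second sequences the steps with Either's own flatMap, which stops at the first Left, as in the GHCi session:

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Try

object ShortCircuitSketch {
  type ErrorOr[A] = Either[Throwable, A]

  // Records which steps actually ran.
  val log = ListBuffer.empty[String]

  // Hypothetical helper: note that the step ran, then evaluate the body,
  // capturing any non-fatal exception as a Left.
  def attempt[A](label: String)(body: => A): ErrorOr[A] = {
    log += label
    Try(body).toEither
  }

  // 1. Errors as plain values: each step runs regardless of earlier
  //    results, and only the last result is returned.
  def errorsAsValues(): ErrorOr[Unit] = {
    log.clear()
    val steps: List[() => ErrorOr[Unit]] = List(
      () => attempt("a")(()),
      () => attempt[Unit]("boom")(throw new Exception("!")),
      () => attempt("b")(())
    )
    steps.map(step => step()).last
  }

  // 2. Errors handled by the bind: Either's flatMap skips every step
  //    after the first Left.
  def errorsInBind(): ErrorOr[Unit] = {
    log.clear()
    for {
      _ <- attempt("a")(())
      _ <- attempt[Unit]("boom")(throw new Exception("!"))
      r <- attempt("b")(())
    } yield r
  }
}
```

After errorsAsValues() the log contains all three labels and the result is a Right; after errorsInBind() the log stops at "boom" and the result is a Left.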

My questions

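I don't claim to be any kind of expert on iteratees, but I've used the various Haskell implementations in a few projects, feel like I more or less understand the fundamental concepts, and had coffee with Oleg once. I'm at a loss here, though. Is this a reasonable way to handle exceptions in the absence of an error state? Is there a way to implement tryIO that would behave more like the enumerator version? Is there some kind of time bomb waiting for me in the fact that my implementation behaves differently?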

Answer 1:

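Edit: here is the real solution. I've left the original post in because I think it's worthwhile to see the pattern. What works for Kleisli works for IterateeT.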

import java.io.{ BufferedReader, File, FileReader }
import scalaz._, Scalaz._, effect._, iteratee.{ Iteratee => I, _ }

object IterateeIOExample {
  type ErrorOr[+A] = EitherT[IO, Throwable, A]

  def openFile(f: File) = IO(new BufferedReader(new FileReader(f)))
  def readLine(r: BufferedReader) = IO(Option(r.readLine))
  def closeReader(r: BufferedReader) = IO(r.close())

  def tryIO[A, B](action: IO[B]) = I.iterateeT[A, ErrorOr, B] {
    EitherT.fromEither(action.catchLeft).map(r => I.sdone(r, I.emptyInput))
  }

  def enumBuffered(r: => BufferedReader) = new EnumeratorT[String, ErrorOr] {
    lazy val reader = r
    def apply[A] = (s: StepT[String, ErrorOr, A]) => s.mapCont(k =>
      tryIO(readLine(reader)) flatMap {
        case None => s.pointI
        case Some(line) => k(I.elInput(line)) >>== apply[A]
      })
  }

  def enumFile(f: File) = new EnumeratorT[String, ErrorOr] {
    def apply[A] = (s: StepT[String, ErrorOr, A]) => 
      tryIO(openFile(f)).flatMap(reader => I.iterateeT[String, ErrorOr, A](
        EitherT(
          enumBuffered(reader).apply(s).value.run.ensuring(closeReader(reader)))))
  }

  def main(args: Array[String]) {
    val action = (
      I.consume[String, ErrorOr, List] %=
      I.filter(a => a.count(_ == '0') >= 25) &=
      enumFile(new File(args(0)))).run.run

    println(action.unsafePerformIO().map(_.size))
  }
}

===== Original post =====

I think you need to mix in an EitherT. Without EitherT you just end up with three Lefts or Rights. With EitherT, the Left is propagated.

I think what you really want is

type ErrorOr[+A] = EitherT[IO, Throwable, A] 
I.iterateeT[A, ErrorOr, B]

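The code below mimics how you are currently composing things. Since IterateeT has no concept of left and right, when you compose it you just end up with a bunch of IO/Id values.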

scala> Kleisli((a:Int) => 4.right[String].point[Id])
res11: scalaz.Kleisli[scalaz.Scalaz.Id,Int,scalaz.\/[String,Int]] = scalaz.KleisliFunctions$$anon$18@73e771ca

scala> Kleisli((a:Int) => "aa".left[Int].point[Id])
res12: scalaz.Kleisli[scalaz.Scalaz.Id,Int,scalaz.\/[String,Int]] = scalaz.KleisliFunctions$$anon$18@be41b41

scala> for { a <- res11; b <- res12 } yield (a,b)
res15: scalaz.Kleisli[scalaz.Scalaz.Id,Int,(scalaz.\/[String,Int], scalaz.\/[String,Int])] = scalaz.KleisliFunctions$$anon$18@42fd1445

scala> res15.run(1)
res16: (scalaz.\/[String,Int], scalaz.\/[String,Int]) = (\/-(4),-\/(aa))

In the code below, instead of using Id, we use an EitherT. Since EitherT has the same bind behavior as Either, we end up with what we want.

scala>  type ErrorOr[+A] = EitherT[Id, String, A]
defined type alias ErrorOr

scala> Kleisli[ErrorOr, Int, Int]((a:Int) => EitherT(4.right[String].point[Id]))
res22: scalaz.Kleisli[ErrorOr,Int,Int] = scalaz.KleisliFunctions$$anon$18@58b547a0

scala> Kleisli[ErrorOr, Int, Int]((a:Int) => EitherT("aa".left[Int].point[Id]))
res24: scalaz.Kleisli[ErrorOr,Int,Int] = scalaz.KleisliFunctions$$anon$18@342f2ceb

scala> for { a <- res22; b <- res24 } yield 2
res25: scalaz.Kleisli[ErrorOr,Int,Int] = scalaz.KleisliFunctions$$anon$18@204eab31

scala> res25.run(2).run
res26: scalaz.Scalaz.Id[scalaz.\/[String,Int]] = -\/(aa)

You can replace Kleisli with IterateeT and Id with IO to get what you need.
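
To make the bind behavior concrete, here is a small hand-rolled sketch of the same idea using only the standard library (assuming Scala 2.12 or later). Thunk stands in for an effect type such as IO and ThunkEither for EitherT; both are simplified, hypothetical types written for this illustration, not Scalaz's definitions:

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Try

object EitherTSketch {
  // Hypothetical stand-in for an effect type such as IO.
  final case class Thunk[A](run: () => A) {
    def map[B](f: A => B): Thunk[B] = Thunk(() => f(run()))
    def flatMap[B](f: A => Thunk[B]): Thunk[B] = Thunk(() => f(run()).run())
  }

  // Simplified transformer: a Thunk producing an Either, whose flatMap
  // stops at the first Left instead of running the remaining steps.
  final case class ThunkEither[E, A](value: Thunk[Either[E, A]]) {
    def map[B](f: A => B): ThunkEither[E, B] =
      ThunkEither(value.map(_.map(f)))

    def flatMap[B](f: A => ThunkEither[E, B]): ThunkEither[E, B] =
      ThunkEither(value.flatMap[Either[E, B]] {
        case Left(e)  => Thunk[Either[E, B]](() => Left(e)) // skip the rest
        case Right(a) => f(a).value
      })
  }

  // Defer a side effect, capturing any non-fatal exception as a Left.
  def attempt[A](body: => A): ThunkEither[Throwable, A] =
    ThunkEither(Thunk(() => Try(body).toEither))

  // Records which steps actually ran.
  val trace = ListBuffer.empty[String]

  // Nothing runs until program.value.run() is called.
  val program: ThunkEither[Throwable, Unit] = for {
    _ <- attempt { trace += "a"; () }
    _ <- attempt[Unit](throw new Exception("!"))
    r <- attempt { trace += "b"; () }
  } yield r
}
```

Running program.value.run() yields a Left carrying the exception, and trace contains only "a": the third step never executes, which is the behavior the question was asking for.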

Answer 2:

The way pipes does it is to type-class composition using the Channel type class:

class Channel p where
    {-| 'idT' acts like a \'T\'ransparent proxy, passing all requests further
        upstream, and passing all responses further downstream. -}
    idT :: (Monad m) => a' -> p a' a a' a m r

    {-| Compose two proxies, satisfying all requests from downstream with
        responses from upstream. -}
    (>->) :: (Monad m)
          => (b' -> p a' a b' b m r)
          -> (c' -> p b' b c' c m r)
          -> (c' -> p a' a c' c m r)
    p1 >-> p2 = p2 <-< p1

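... and derive lifted composition over EitherT from the base composition. This is a special case of the principle of proxy transformers, introduced in pipes-2.4, which allows lifting composition over arbitrary extensions.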

This lifting requires defining an EitherT specialized to the shape of the Proxy type, in Control.Proxy.Trans.Either:

newtype EitherP e p a' a b' b (m :: * -> *) r
  = EitherP { runEitherP :: p a' a b' b m (Either e r) }

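This specialization to the Proxy shape is necessary in order to be able to define a well-typed instance of the Channel class. Scala might be more flexible in this regard than Haskell.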

Then I just redefine the Monad instance (and other instances) along with all the ordinary EitherT operations for this specialized type:

throw :: (Monad (p a' a b' b m)) => e -> EitherP e p a' a b' b m r
throw = EitherP . return . Left

catch
 :: (Monad (p a' a b' b m))
 => EitherP e p a' a b' b m r        -- ^ Original computation
 -> (e -> EitherP f p a' a b' b m r) -- ^ Handler
 -> EitherP f p a' a b' b m r        -- ^ Handled computation
catch m f = EitherP $ do
    e <- runEitherP m
    runEitherP $ case e of
        Left  l -> f     l
        Right r -> right r

With this in hand I can then define the following lifted composition instance:

-- Given that 'p' is composable, so is 'EitherP e p'
instance (Channel p) => Channel (EitherP e p) where
    idT = EitherP . idT
    p1 >-> p2 = (EitherP .) $ runEitherP . p1 >-> runEitherP . p2

To understand what is going on there, just follow the types:

p1 :: b' -> EitherP e p a' a b' b m r
p2 :: c' -> EitherP e p b' b c' c m r

runEitherP . p1 :: b' -> p a' a b' b m (Either e r)
runEitherP . p2 :: c' -> p b' b c' c m (Either e r)

-- Use the base composition for 'p'
runEitherP . p1 >-> runEitherP . p2
 :: c' -> p a' a c' c m (Either e r)

-- Rewrap in EitherP
(EitherP . ) $ runEitherP . p1 >-> runEitherP . p2
 :: c' -> EitherP e p a' a c' c m r

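This lets you throw and catch errors within a particular stage without interrupting the other stages. Here's an example I copied and pasted from my pipes-2.4 announcement post: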

import Control.Monad (forever)
import Control.Monad.Trans (lift)
import Control.Proxy
import Control.Proxy.Trans.Either as E
import Safe (readMay)

promptInts :: () -> EitherP String Proxy C () () Int IO r
promptInts () = recover $ forever $ do
    str <- lift getLine
    case readMay str of
        Nothing -> E.throw "Could not parse an integer"
        Just n  -> liftP $ respond n

recover p =
    p `E.catch` (\str -> lift (putStrLn str) >> recover p)

main = runProxy $ runEitherK $ mapP printD <-< promptInts

Here's the result:

>>> main
1<Enter>
1
Test<Enter>
Could not parse an integer
Apple<Enter>
Could not parse an integer
5<Enter>
5

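The answer for the iteratee approach is similar. You must take your existing way of composing iteratees and lift it over EitherT. Whether you use type classes or just define a new composition operator is up to you.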

Some other useful links:

The pipes-2.4 announcement post
Control.Proxy.Class, Control.Proxy.Trans, and Control.Proxy.Trans.Either
A very similar Stack Overflow question on the same topic (except for pipes)

Source: Handling exceptions in an iteratee library without an error state

Tags: scala haskell io scalaz iterate
