How do you stop building an Option[Collection] upo

Posted 2019-01-25 13:31

Question:

When building up a collection inside an Option, each attempt to make the next member of the collection might fail, making the collection as a whole a failure, too. Upon the first failure to make a member, I'd like to give up immediately and return None for the whole collection. What is an idiomatic way to do this in Scala?

Here's one approach I've come up with:

def findPartByName(name: String): Option[Part] = . . .

def allParts(names: Seq[String]): Option[Seq[Part]] =
  names.foldLeft(Some(Seq.empty): Option[Seq[Part]]) {
    (result, name) => result match {
      case Some(parts) =>
        findPartByName(name) flatMap { part => Some(parts :+ part) }
      case None => None
    }
  }

In other words, if any call to findPartByName returns None, allParts returns None. Otherwise, allParts returns a Some containing a collection of Parts, all of which are guaranteed to be valid. An empty collection is OK.

The above has the advantage that it stops calling findPartByName after the first failure. But the foldLeft still iterates once for each name, regardless.
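As a minimal sketch of that behavior (not the asker's exact code), the same fold can be written with a for-comprehension over the two Options. The `Part` case class, the stub `findPartByName` and the `lookups` counter are assumptions added here only to make the example self-contained and to show that lookups stop after the first failure, even though the fold itself still visits every name.

```scala
object FoldSketch {
  // Hypothetical stand-ins for the real Part type and lookup function.
  final case class Part(name: String)

  // Counts how many times the lookup is actually invoked.
  var lookups = 0

  // Assumption for illustration: the empty name is the only one that fails.
  def findPartByName(name: String): Option[Part] = {
    lookups += 1
    if (name.nonEmpty) Some(Part(name)) else None
  }

  def allParts(names: Seq[String]): Option[Seq[Part]] =
    names.foldLeft(Option(Seq.empty[Part])) { (result, name) =>
      // Once result is None, findPartByName is never called again,
      // but foldLeft still steps through the remaining names.
      for {
        parts <- result
        part  <- findPartByName(name)
      } yield parts :+ part
    }
}
```

With the names `Seq("a", "", "c", "d")`, this returns None after only two lookups.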

Here's a version that bails out as soon as findPartByName returns a None:

def allParts2(names: Seq[String]): Option[Seq[Part]] = Some(
  for (name <- names) yield findPartByName(name) match {
    case Some(part) => part
    case None => return None
  }
)

I currently find the second version more readable, but (a) what seems most readable is likely to change as I get more experience with Scala, (b) I get the impression that early return is frowned upon in Scala, and (c) neither one seems to make what's going on especially obvious to me.

The combination of "all-or-nothing" and "give up on the first failure" seems like such a basic programming concept that I figure there must be a common Scala or functional idiom to express it.

Answer 1:

The return in your code is actually a couple levels deep in anonymous functions. As a result, it must be implemented by throwing an exception which is caught in the outer function. This isn't efficient or pretty, hence the frowning.
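A small sketch of why this matters: because the non-local return travels as a Throwable, a broad catch placed between the lambda and the enclosing method can intercept it. The `find` lookup and its data below are hypothetical, and the compiler may warn about the non-local return or the catch-all.

```scala
object NonLocalReturnDemo {
  // Hypothetical lookup for illustration: only "a" and "b" exist.
  def find(name: String): Option[Int] = Map("a" -> 1, "b" -> 2).get(name)

  // The return sits inside a lambda, so it is carried out of map
  // by a control exception that the enclosing method catches.
  def allParts(names: Seq[String]): Option[Seq[Int]] = Some(
    names.map(n => find(n).getOrElse { return None })
  )

  // Same expression wrapped in a catch-all: the control exception
  // is caught here first, so the early return never takes effect.
  def allPartsSwallowed(names: Seq[String]): Option[Seq[Int]] =
    try Some(names.map(n => find(n).getOrElse { return None }))
    catch { case _: Throwable => Some(Seq.empty) }
}
```

Here `allParts(Seq("a", "x"))` gives None as intended, while `allPartsSwallowed(Seq("a", "x"))` gives `Some(Seq.empty)`.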

It is easiest and most efficient to write this with a while loop and an Iterator.

def allParts3(names: Seq[String]): Option[Seq[Part]] = {
  val iterator = names.iterator
  var accum = List.empty[Part]
  while (iterator.hasNext) {
    findPartByName(iterator.next) match {
      case Some(part) => accum +:= part
      case None => return None
    }
  }
  Some(accum.reverse)
}

Because we don't know what kind of Seq names is, we must create an iterator to loop over it efficiently rather than using tail or indexes. The while loop can be replaced with a tail-recursive inner function, but with the iterator a while loop is clearer.
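For comparison, here is a sketch of the tail-recursive variant mentioned above, still driven by the iterator. The `Part` case class and the stub `findPartByName` are assumptions added only so the example stands alone.

```scala
import scala.annotation.tailrec

object TailRecSketch {
  // Hypothetical stand-ins for the real Part type and lookup function.
  final case class Part(name: String)

  // Assumption for illustration: the empty name is the only one that fails.
  def findPartByName(name: String): Option[Part] =
    if (name.nonEmpty) Some(Part(name)) else None

  def allParts(names: Seq[String]): Option[Seq[Part]] = {
    val iterator = names.iterator

    // Accumulates in reverse; stops at the first None without touching
    // the rest of the iterator.
    @tailrec
    def loop(accum: List[Part]): Option[Seq[Part]] =
      if (!iterator.hasNext) Some(accum.reverse)
      else findPartByName(iterator.next()) match {
        case Some(part) => loop(part :: accum)
        case None       => None
      }

    loop(Nil)
  }
}
```

No return statement is needed here, since the None branch is simply the value of the function.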



Answer 2:

Scala collections have some options to use laziness to achieve that.

You can use view and takeWhile:

def allPartsWithView(names: Seq[String]): Option[Seq[Part]] = {
    val successes = names.view.map(findPartByName)
                              .takeWhile(!_.isEmpty)
                              .map(_.get)
                              .force
    if (!names.isDefinedAt(successes.size)) Some(successes)
    else None
}

Using isDefinedAt instead of comparing sizes avoids potentially traversing a long names input in the case of an early failure.

You could also use toStream and span to achieve the same thing:

def allPartsWithStream(names: Seq[String]): Option[Seq[Part]] = {
    val (good, bad) = names.toStream.map(findPartByName)
                                    .span(!_.isEmpty)
    if (bad.isEmpty) Some(good.map(_.get).toList)
    else None
}

I've found that mixing view and span causes findPartByName to be evaluated twice per item in case of success.

The whole idea of returning an error condition if any error occurs does, however, sound more like a job ("the" job?) for throwing and catching exceptions. I suppose it depends on the context in your program.
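A rough sketch of what that exception-based version could look like. The `PartNotFound` marker object, the `Part` case class and the stub `findPartByName` are all hypothetical names introduced for this example.

```scala
import scala.util.control.NoStackTrace

object ExceptionSketch {
  // Hypothetical stand-ins for the real Part type and lookup function.
  final case class Part(name: String)

  // Assumption for illustration: the empty name is the only one that fails.
  def findPartByName(name: String): Option[Part] =
    if (name.nonEmpty) Some(Part(name)) else None

  // Hypothetical marker exception; NoStackTrace skips filling in a stack trace.
  object PartNotFound extends RuntimeException with NoStackTrace

  def allParts(names: Seq[String]): Option[Seq[Part]] =
    try Some(names.map(n => findPartByName(n).getOrElse(throw PartNotFound)))
    catch { case PartNotFound => None } // only our own marker is caught
}
```

Catching only the dedicated marker keeps unrelated failures inside findPartByName from being silently turned into None.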



Answer 3:

This combines the other answers: a mutable flag together with the map and takeWhile we love.

Given an infinite stream:

scala> var count = 0
count: Int = 0

scala> val vs = Stream continually { println(s"Compute $count") ; count += 1 ; count }
Compute 0
vs: scala.collection.immutable.Stream[Int] = Stream(1, ?)

Take until a predicate fails:

scala> var failed = false
failed: Boolean = false

scala> vs map { case x if x < 5 => println(s"Yup $x"); Some(x) case x => println(s"Nope $x"); failed = true; None } takeWhile (_.nonEmpty) map (_.get)
Yup 1
res0: scala.collection.immutable.Stream[Int] = Stream(1, ?)

scala> .toList
Compute 1
Yup 2
Compute 2
Yup 3
Compute 3
Yup 4
Compute 4
Nope 5
res1: List[Int] = List(1, 2, 3, 4)

or more simply:

scala> var count = 0
count: Int = 0

scala> val vs = Stream continually { println(s"Compute $count") ; count += 1 ; count }
Compute 0
vs: scala.collection.immutable.Stream[Int] = Stream(1, ?)

scala> var failed = false
failed: Boolean = false

scala> vs map { case x if x < 5 => println(s"Yup $x"); x case x => println(s"Nope $x"); failed = true; -1 } takeWhile (_ => !failed)
Yup 1
res3: scala.collection.immutable.Stream[Int] = Stream(1, ?)

scala> .toList
Compute 1
Yup 2
Compute 2
Yup 3
Compute 3
Yup 4
Compute 4
Nope 5
res4: List[Int] = List(1, 2, 3, 4)


Answer 4:

I think your allParts2 function has a problem: one of the two branches of your match expression performs a side effect. The return statement is the non-idiomatic bit, since it behaves like an imperative jump.

The first function looks better, but if you are concerned about the sub-optimal iteration that foldLeft could produce, you should probably go for a recursive solution such as the following:

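import scala.annotation.tailrec

def allParts(names: Seq[String]): Option[Seq[Part]] = {
  // Walks the remaining names, accumulating parts until one lookup fails.
  @tailrec
  def allPartsRec(names: Seq[String], acc: Seq[Part]): Option[Seq[Part]] = names match {
    case Seq(x, xs@_*) => findPartByName(x) match {
      case Some(part) => allPartsRec(xs, acc :+ part) // append to keep input order
      case None => None                               // give up at the first failure
    }
    case _ => Some(acc)                               // no names left: success
  }

  allPartsRec(names, Seq.empty)
}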

I didn't compile/run it but the idea should be there and I believe it is more idiomatic than using the return trick!



Answer 5:

I keep thinking that this has to be a one- or two-liner. I came up with one:

def allParts4(names: Seq[String]): Option[Seq[Part]] = Some(
  names.map(findPartByName(_) getOrElse { return None })
)

Advantage:

  • The intent is extremely clear. There's no clutter and there's no exotic or nonstandard Scala.

Disadvantages:

  • The early return violates referential transparency, as Aldo Stracquadanio pointed out. You can't put the body of allParts4 into its calling code without changing its meaning.

  • Possibly inefficient due to the internal throwing and catching of an exception, as wingedsubmariner pointed out.

Sure enough, I put this into some real code, and within ten minutes, I'd enclosed the expression inside something else, and predictably got surprising behavior. So now I understand a little better why early return is frowned upon.

This is such a common operation, and so important in code that makes heavy use of Option, and Scala is normally so good at combining things, that I can't believe there isn't a pretty natural idiom to do it correctly.

Aren't monads good for specifying how to combine actions? Is there a GiveUpAtTheFirstSignOfResistance monad?
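For what it's worth, Option's own flatMap already gives up at the first None when the number of steps is fixed, as the sketch below tries to show. The `find` function, its data and the `lookups` counter are hypothetical. Extending the same idea to a whole collection is what functional libraries usually call traverse or sequence.

```scala
object OptionChainSketch {
  // Counts how many times the lookup is actually invoked.
  var lookups = 0

  // Hypothetical lookup for illustration: only "a" and "c" exist.
  def find(name: String): Option[Int] = {
    lookups += 1
    Map("a" -> 1, "c" -> 3).get(name)
  }

  // Desugars to nested flatMap/map calls; a None at any step skips
  // every later generator and makes the whole result None.
  def three(x: String, y: String, z: String): Option[Seq[Int]] =
    for {
      a <- find(x)
      b <- find(y)
      c <- find(z)
    } yield Seq(a, b, c)
}
```

Calling `three("a", "missing", "c")` returns None after two lookups; the third find is never run.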