scala return on first Some in list

2019-03-15 10:41发布

问题:

I have a list l:List[T1] and currently im doing the following:

myfun : T1 -> Option[T2]
val x: Option[T2] = l.map{ myfun(l) }.flatten.find(_=>true)

The myfun function returns None or Some, flatten throws away all the None's and find returns the first element of the list if any.

This seems a bit hacky to me. Im thinking that there might exist some for-comprehension or similar that will do this a bit less wasteful or more clever. For example: I dont need any subsequent answers if myfun returns any Some during the map of the list l.

回答1:

How about:

l.toStream flatMap (myfun andThen (_.toList)) headOption

Stream is lazy, so it won't map everything in advance, but it won't remap things either. Instead of flattening things, convert Option to List so that flatMap can be used.



回答2:

Well, this is almost, but not quite

val x = (l flatMap myfun).headOption

But you are returning a Option rather than a List from myfun, so this may not work. If so (I've no REPL to hand) then try instead:

val x = (l flatMap(myfun(_).toList)).headOption


回答3:

Well, the for-comprehension equivalent is pretty easy

(for(x<-l, y<-myfun(x)) yield y).headOption

which, if you actually do the the translation works out the same as what oxbow_lakes gave. Assuming reasonable laziness of List.flatmap, this is both a clean and efficient solution.



回答4:

In addition to using toStream to make the search lazy, we can use Stream::collectFirst:

List(1, 2, 3, 4, 5, 6, 7, 8).toStream.map(myfun).collectFirst { case Some(d) => d }
// Option[String] = Some(hello)
// given def function(i: Int): Option[String] = if (i == 5) Some("hello") else None

This:

  • Transforms the List into a Stream in order to stop the search early.

  • Transforms elements using myFun as Option[T]s.

  • Collects the first mapped element which is not None and extract it.

Starting Scala 2.13, with the deprecation of Streams in favor of LazyLists, this would become:

List(1, 2, 3, 4, 5, 6, 7, 8).to(LazyList).map(function).collectFirst { case Some(d) => d }


回答5:

As of 2017, the previous answers seem to be outdated. I ran some benchmarks (list of 10 million Ints, first match roughly in the middle, Scala 2.12.3, Java 1.8.0, 1.8 GHz Intel Core i5). Unless otherwise noted, list and map have the following types:

list: scala.collection.immutable.List
map: A => Option[B]

Simply call map on the list: ~1000 ms

list.map(map).find(_.isDefined).flatten

First call toStream on the list: ~1200 ms

list.toStream.map(map).find(_.isDefined).flatten

Call toStream.flatMap on the list: ~450 ms

list.toStream.flatMap(map(_).toList).headOption

Call flatMap on the list: ~100 ms

list.flatMap(map(_).toList).headOption

First call iterator on the list: ~35 ms

list.iterator.map(map).find(_.isDefined).flatten

Recursive function find(): ~25 ms

def find[A,B](list: scala.collection.immutable.List[A], map: A => Option[B]) : Option[B] = {
  list match {
    case Nil => None
    case head::tail => map(head) match {
      case None => find(tail, map)
      case result @ Some(_) => result
    }
  }
}

Iterative function find(): ~25 ms

def find[A,B](list: scala.collection.immutable.List[A], map: A => Option[B]) : Option[B] = {
  for (elem <- list) {
    val result = map(elem)
    if (result.isDefined) return result
  }
  return None
}

You can further speed up things by using Java instead of Scala collections and a less functional style.

Loop over indices in java.util.ArrayList: ~15 ms

def find[A,B](list: java.util.ArrayList[A], map: A => Option[B]) : Option[B] = {
  var i = 0
  while (i < list.size()) {
    val result = map(list.get(i))
    if (result.isDefined) return result
    i += 1
  }
  return None
}

Loop over indices in java.util.ArrayList with function returning null instead of None: ~10 ms

def find[A,B](list: java.util.ArrayList[A], map: A => B) : Option[B] = {
  var i = 0
  while (i < list.size()) {
    val result = map(list.get(i))
    if (result != null) return Some(result)
    i += 1
  }
  return None
}

(Of course, one would usually declare the parameter type as java.util.List, not java.util.ArrayList. I chose the latter here because it's the class I used for the benchmarks. Other implementations of java.util.List will show different performance - most will be worse.)