I sometimes find myself in a situation where I have some Stream[X] and a function X => Future[Y] that I'd like to combine into a Future[Stream[Y]], and I can't seem to find a way to do it. For example, I have
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

val x = (1 until 10).toStream
def toFutureString(value: Int): Future[String] = Future(value.toString)
val result: Future[Stream[String]] = ???
I tried
val result = Future.traverse(x)(toFutureString)
which gives the correct result, but seems to consume the entire stream before returning the Future, which more or less defeats the purpose.
I tried
val result = x.flatMap(toFutureString)
but that doesn't compile: type mismatch; found: scala.concurrent.Future[String], required: scala.collection.GenTraversableOnce[?]
Finally,
val result = x.map(toFutureString)
returns the somewhat odd and useless Stream[Future[String]].
What should I do here to get things fixed?
Edit: I'm not stuck on a Stream; I'd be equally happy with the same operation on an Iterator, as long as it won't block on evaluating all items before starting to process the head.
Edit2: I'm not 100% sure that the Future.traverse construct needs to traverse the entire stream before returning a Future[Stream], but I think it does. If it doesn't, that's a fine answer in itself.
Edit3: I also don't need the result to be in order; I'm fine with the returned stream or iterator being in whatever order.
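One way to check the doubt raised in Edit2 is to count how often the function has been applied by the time Future.traverse hands back its Future. The following is a minimal sketch under stated assumptions: it uses a plain List and the global execution context, and the names TraverseEagerness, appliedBeforeReturn and applied are hypothetical helpers introduced only for this illustration.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object TraverseEagerness {
  // Returns how many times the function had been applied at the moment
  // Future.traverse handed back its Future (nothing is awaited here).
  def appliedBeforeReturn(n: Int): Int = {
    val applied = new AtomicInteger(0) // hypothetical counter, for illustration only

    def toFutureString(value: Int): Future[String] = {
      applied.incrementAndGet() // runs on the calling thread, when the future is created
      Future(value.toString)
    }

    // Kick off the traversal; the resulting Future is deliberately not awaited.
    val result: Future[List[String]] =
      Future.traverse((1 to n).toList)(toFutureString)

    applied.get
  }
}
```

In the standard-library versions I am aware of, the count already equals the length of the input at that point, which suggests that every element is visited before the Future is returned. With a lazy source, that would mean forcing the whole thing.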
The accepted answer is no longer valid, as the modern version of Scalaz traverse() behaves differently and tries to consume the entire stream at invocation time.

As to the question, I would say that it's impossible to achieve this in a truly non-blocking fashion. Future[Stream[Y]] cannot be resolved until Stream[Y] is available. And since Y is produced asynchronously by the function X => Future[Y], you cannot get Y without blocking at the time when you traverse Stream[Y]. That means that either all the Future[Y] must be resolved before resolving Future[Stream[Y]] (which requires consuming the entire stream), or you must allow blocking to occur while traversing Stream[Y] (on items whose underlying futures aren't completed yet). But if we allow blocking during the traversal, then what would be the definition of completion for the resulting future? From that perspective it could be the same as Future.successful(BlockingStream[Y]). That is in turn semantically equal to the original Stream[Future[Y]].

In other words, I think there is an issue in the question itself.
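To make the "blocking during traversal" option concrete, here is a minimal sketch, assuming an Iterator instead of a Stream (which the question allows) and the global execution context. The names LazyFutureIterator, firstResults and created are hypothetical. Futures are created only as elements are requested, and the consumer blocks on each one in turn.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object LazyFutureIterator {
  // Takes the first n results from an infinite source and also reports
  // how many futures were created along the way.
  def firstResults(n: Int): (List[String], Int) = {
    val created = new AtomicInteger(0) // hypothetical counter, for illustration only

    def toFutureString(value: Int): Future[String] = {
      created.incrementAndGet()
      Future(value.toString)
    }

    // Iterator.map is lazy: no future exists until its element is requested.
    val futures: Iterator[Future[String]] =
      Iterator.from(1).map(v => toFutureString(v))

    // The consumer blocks on each element as it reaches it.
    val results = futures.take(n).map(f => Await.result(f, 5.seconds)).toList
    (results, created.get)
  }
}
```

Calling LazyFutureIterator.firstResults(3) works against an infinite source because only the requested elements are ever turned into futures. The cost is that the traversal itself blocks, which is the trade-off described above.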
You're on the right track with traverse, but unfortunately it looks like the standard library's definition is a little broken in this case: it shouldn't need to consume the stream before returning.

Future.traverse is a specific version of a much more general function that works on any applicative functor wrapped in a "traversable" type (see these papers or my answer here for more information, for example).

The Scalaz library provides this more general version, and it works as expected in this case (note that I'm getting the applicative functor instance for Future from scalaz-contrib; it's not yet in the stable versions of Scalaz, which are still cross-built against Scala 2.9.2, which doesn't have this Future):

This returns immediately on an infinite stream, so we know for sure that it's not being consumed first.
As a footnote: If you look at the source for Future.traverse you'll see that it's implemented in terms of foldLeft, which is convenient, but not necessary or appropriate in the case of streams.

Forgetting about Stream:
yields (on my 8 core box):
So 8 of them get kicked off immediately (one for each core, though that's configurable via the threadpool executor), and then as those complete more are kicked off. The Future[List[String]] returns immediately, and then after a pause it starts printing those "completed x" messages.
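A minimal sketch of the kind of code that behaves this way, assuming the global execution context. TraverseList, slowToString and the 50 ms sleep are hypothetical stand-ins for illustration, not the snippet that was measured above.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object TraverseList {
  // Hypothetical slow task: sleeps to simulate work, then reports completion.
  def slowToString(value: Int): Future[String] = Future {
    Thread.sleep(50)
    println(s"completed $value")
    value.toString
  }

  // Future.traverse hands back the Future[List[String]] right away;
  // the tasks themselves run on the pool in the background.
  def start(n: Int): Future[List[String]] =
    Future.traverse((1 to n).toList)(slowToString)

  // Blocks the caller until every task has finished (for demonstration only).
  def run(n: Int): List[String] = Await.result(start(n), 1.minute)
}
```

How many tasks run at once depends on the pool behind the execution context. The order in which the "completed" lines print may vary, but the resulting list keeps the order of the input.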
An example use of this could be when you have a List[Url] and a function of type Url => Future[HttpResponseBody]. You could call Future.traverse on that list with that function and kick off those HTTP requests in parallel, getting back a single future that holds a List of the results.
Was something like that what you were going for?