Suppose I have
val foo : Seq[Double] = ...
val bar : Seq[Double] = ...
and I wish to produce a seq baz where baz(i) = foo(i) + bar(i). One way I can think of to do this is
val baz : Seq[Double] = (foo.toList zip bar.toList) map { case (f, b) => f + b }
However, this feels both ugly and inefficient: I have to convert both seqs to lists (which blows up with lazy lists) and create a temporary list of tuples, only to map over it and let it be GCed. Maybe streams solve the laziness problem, but in any case this feels unnecessarily ugly. In Lisp, the map function can map over multiple sequences. I would write
(mapcar (lambda (f b) (+ f b)) foo bar)
And no temporary lists would get created anywhere. Is there a map-over-multiple-lists function in Scala, or is zip combined with destructuring really the 'right' way to do this?
The function you want is called zipWith, but it isn't a part of the standard library. It will be in 2.8 (UPDATE: Apparently not, see comments). See this Trac ticket.
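For illustration, here is a minimal sketch of what a hand-rolled zipWith could look like. It is not a standard library function; the name, signature and sample data are assumptions made for this example, and it is written against Scala 2.13. It walks both inputs in lockstep, so no intermediate list of tuples is built, and it stops at the end of the shorter input.

```scala
object ZipWithSketch {
  // Hypothetical helper (not in the standard library): combines two sequences
  // element by element and stops when the shorter one runs out.
  def zipWith[A, B, C](as: Seq[A], bs: Seq[B])(f: (A, B) => C): Seq[C] = {
    val ai = as.iterator
    val bi = bs.iterator
    val out = Seq.newBuilder[C]
    // Advance both iterators together; no tuples are allocated along the way.
    while (ai.hasNext && bi.hasNext) out += f(ai.next(), bi.next())
    out.result()
  }

  // Sample data, made up for the example.
  val foo: Seq[Double] = Seq(1.0, 2.0, 3.0)
  val bar: Seq[Double] = Seq(10.0, 20.0, 30.0)
  val baz: Seq[Double] = zipWith(foo, bar)(_ + _)
}
```

Note that this version is strict: it builds the whole result eagerly, so it would not terminate on two infinite lazy sequences.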
UPDATE: It has been pointed out (in comments) that this "answer" doesn't actually address the question being asked. This answer will map over every combination of foo and bar, producing N x M elements, instead of the min(M, N) as requested. So, this is wrong, but left for posterity since it's good information.
The best way to do this is with flatMap combined with map. Code speaks louder than words:
This will produce a single Seq[Double], exactly as you would expect. This pattern is so common that Scala actually includes some syntactic magic which implements it:
Or, alternatively:
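The snippets this answer refers to are not reproduced here. As a rough illustration only (Scala 2.13 assumed; foo and bar are sample data made up for the example), the flatMap/map form and its for-comprehension equivalent could look like the following. As the update warns, the result contains every combination, N x M elements, not an element-wise sum.

```scala
object CrossProductSketch {
  // Sample data, made up for the example: N = 2 and M = 3.
  val foo: Seq[Double] = Seq(1.0, 2.0)
  val bar: Seq[Double] = Seq(10.0, 20.0, 30.0)

  // flatMap combined with map: every f is paired with every b.
  val viaFlatMap: Seq[Double] = foo.flatMap(f => bar.map(b => f + b))

  // The equivalent for-comprehension; the compiler rewrites it into the
  // flatMap/map chain above.
  val viaFor: Seq[Double] = for { f <- foo; b <- bar } yield f + b
}
```

Both values hold six elements here, which is exactly why this approach does not answer the original question.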
The for { ... } syntax is really the most idiomatic way to do this. You can continue to add generator clauses (e.g. b <- bar) as necessary. Thus, if it suddenly becomes three Seqs that you must map over, you can easily scale your syntax along with your requirements (to coin a phrase).
When faced with a similar task, I added the following pimp to Iterables:
Having this, one can do something like:
Notice that you should carefully consider the collection type passed as the nested Iterable, since tail and head will be called on it repeatedly. So, ideally you should pass an Iterable[List] or another collection with fast tail and head.
Also, this code expects nested collections of the same size. That was my use case, but I suspect this can be improved, if needed.
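The enrichment itself is not shown here, so the following is only a minimal sketch of what such a helper might look like, not the original code. The names NestedOps and mapColumns are invented for this example, Scala 2.13 is assumed, and the sample data is made up. It repeatedly takes the head of every nested collection, applies a function to those heads, and recurses on the tails, which is why fast head and tail matter.

```scala
object NestedZipSketch {
  // Hypothetical enrichment; NestedOps and mapColumns are made-up names.
  implicit class NestedOps[A](rows: Iterable[Iterable[A]]) {
    // Applies f to the i-th elements of all nested collections, for each i.
    // Assumes every nested collection has the same size: the loop stops when
    // the first one is exhausted, and head throws if another one is shorter.
    def mapColumns[B](f: Iterable[A] => B): List[B] = {
      @annotation.tailrec
      def loop(rest: Iterable[Iterable[A]], acc: List[B]): List[B] =
        if (rest.isEmpty || rest.head.isEmpty) acc.reverse
        else loop(rest.map(_.tail), f(rest.map(_.head)) :: acc)
      loop(rows, Nil)
    }
  }

  // Sample data, made up for the example; Lists have constant-time head and tail.
  val foo: List[Double] = List(1.0, 2.0, 3.0)
  val bar: List[Double] = List(10.0, 20.0, 30.0)
  val baz: List[Double] = List(foo, bar).mapColumns(_.sum)
}
```

With a third list added to the outer List, the same call sums three operands per position without any change to the helper.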
In Scala 2.8:
And it works for more than two operands in the same way. I.e. you could then follow this up with:
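The snippets for this answer are missing here. In Scala 2.8 the idiom was written with zipped on a tuple, as in (foo, bar).zipped map (_ + _). The sketch below uses lazyZip, the Scala 2.13 replacement for zipped, so it assumes 2.13 or later; the sample data and the name qux are made up for the example.

```scala
object ZippedSketch {
  // Sample data, made up for the example.
  val foo: Seq[Double] = Seq(1.0, 2.0, 3.0)
  val bar: Seq[Double] = Seq(10.0, 20.0, 30.0)

  // Two operands: the function receives both elements directly, so no tuple
  // pattern is needed and no intermediate collection of pairs is built.
  val baz: Seq[Double] = foo.lazyZip(bar).map(_ + _)

  // Three operands work the same way.
  val qux: Seq[Double] = foo.lazyZip(bar).lazyZip(baz).map(_ * _ * _)
}
```

Like zip, the result is as long as the shortest operand.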
A lazy list isn't a copy of a list - it's more like a single object. In the case of a lazy zip implementation, each time it is asked for the next item, it grabs an item from each of the two input lists and creates a tuple from them, and you then break the tuple apart with the pattern-matching in your lambda.
So there's never a need to create a complete copy of the whole input list(s) before starting to operate on them. It boils down to a very similar allocation pattern to any application running on the JVM - lots of very short-lived but small allocations, which the JVM is optimised to deal with.
Update: to be clear, you need to be using Streams (lazy lists) not Lists. Scala's streams have a zip that works the lazy way, and so you shouldn't be converting things into lists.
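To make the lazy behaviour concrete, here is a small sketch. It assumes Scala 2.13, where LazyList is the successor of the Stream type discussed above, and the two infinite sequences are made up for the example.

```scala
object LazyZipSketch {
  // Two infinite lazy sequences; nothing is computed until elements are demanded.
  val foo: LazyList[Double] = LazyList.iterate(1.0)(_ + 1.0)   // 1.0, 2.0, 3.0, ...
  val bar: LazyList[Double] = LazyList.iterate(10.0)(_ + 10.0) // 10.0, 20.0, 30.0, ...

  // zip and map are both lazy here, so baz is itself an infinite LazyList.
  val baz: LazyList[Double] = foo.zip(bar).map { case (f, b) => f + b }

  // Forcing three elements pairs and sums only those three positions.
  val firstThree: List[Double] = baz.take(3).toList
}
```

Calling something like baz.sum or baz.toList on the infinite result would never finish, which is the "folding" caveat.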
Ideally your algorithm should be capable of working on two infinite streams without blowing up (assuming it doesn't do any folding, of course, but just reads and generates streams).
Well, that, the lack of zip, is a deficiency in Scala's 2.7 Seq. Scala 2.8 has a well-thought-out collection design, to replace the ad hoc way the collections present in 2.7 came to be (note that they weren't all created at once, with a unified design).
Now, when you want to avoid creating a temporary collection, you should use "projection" on Scala 2.7, or "view" on Scala 2.8. This will give you a collection type for which certain operations, particularly map, flatMap and filter, are non-strict. On Scala 2.7, the projection of a List is a Stream. On Scala 2.8, there is a SequenceView of a Sequence, but since there is a zipWith right there in the Sequence, you wouldn't even need it.
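As a sketch of the view approach (assuming Scala 2.13, where the method is simply view; the sample data is made up for the example), zipping through a view avoids materialising the intermediate collection of pairs:

```scala
object ViewSketch {
  // Sample data, made up for the example.
  val foo: Seq[Double] = Seq(1.0, 2.0, 3.0)
  val bar: Seq[Double] = Seq(10.0, 20.0, 30.0)

  // On a view, zip and map only describe the computation. Elements are paired
  // and summed one at a time when toSeq finally walks the view.
  val baz: Seq[Double] = foo.view.zip(bar).map { case (f, b) => f + b }.toSeq
}
```

Short-lived tuples are still allocated one per element, but there is never a full temporary Seq of them.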
Having said that, as mentioned, the JVM is optimized to handle temporary object allocations, and, when running in server mode, the run-time optimization can do wonders. So, do not optimize prematurely. Test the code in the conditions it will be run in. If you haven't planned to run it in server mode, rethink that if the code is expected to be long-running, and optimize when/where/if necessary.
EDIT
What is actually going to be available on Scala 2.8 is this: