Let's say I have a Stream of length 1,000,000 with all 1's.
scala> val million = Stream.fill(100000000)(1)
million: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> million filter (x => x % 2 == 0)
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
I get an OutOfMemoryError.
Then I tried the same filter call with a List.
scala> val y = List.fill(1000000)(1)
y: List[Int] = List(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ...
scala> y.filter(x => x % 2 == 0)
res2: List[Int] = List()
Yet it succeeds.
Why does Stream#filter run out of memory here, while List#filter completes just fine?
Lastly, with a large stream, will filter result in the non-lazy evaluation of the entire stream?
Overhead of List: a single object (an instance of ::) with 2 fields (2 pointers) per element.
Overhead of Stream: an instance of Cons (with 3 pointers) plus an instance of Function (tl: => Stream[A]) for the lazy evaluation of Stream#tail, per element. So you'll spend roughly 2 times more memory on a Stream.
You have defined your Stream as a val, so the reference to its head keeps every evaluated element reachable. Alternatively, you could define million as a def; in that case the GC can reclaim all the created elements after filter, and you'll get your memory back.
Note that only the tail of a Stream is lazy; the head is strict. So filter evaluates strictly until it finds the first element that satisfies the given predicate, and since there are no such elements in your Stream, filter iterates over the entire million stream and puts all of its elements in memory.
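To make the strict-head behaviour concrete, here is a minimal sketch that counts how many elements the predicate inspects before filter returns. It assumes a Scala version in which Stream is still available (it is deprecated from 2.13 on, so expect warnings there). The names StreamFilterSketch, ones, testedBeforeFirstMatch and testedWhenNothingMatches are made up for this illustration, and the sizes are kept small so that it runs in any heap.

```scala
object StreamFilterSketch {

  // Hypothetical helper: a def rather than a val, so this object keeps no
  // long-lived reference to the first cell of the stream.
  def ones(n: Int): Stream[Int] = Stream.fill(n)(1)

  /** Counts how many elements filter inspects before it can return
    * when a match does exist (here the first match is 6). */
  def testedBeforeFirstMatch(): Int = {
    var tested = 0
    val result = Stream.from(1).filter { x => tested += 1; x > 5 }
    // At this point only the head of `result` has been computed;
    // its tail is still an unevaluated thunk.
    tested
  }

  /** Counts how many elements filter inspects when nothing matches:
    * it has to walk the whole stream before it can return Empty. */
  def testedWhenNothingMatches(n: Int): Int = {
    var tested = 0
    val result = ones(n).filter { x => tested += 1; x % 2 == 0 }
    tested
  }

  def main(args: Array[String]): Unit = {
    println(testedBeforeFirstMatch())       // 6
    println(testedWhenNothingMatches(1000)) // 1000
  }
}
```

With a match available, filter stops as soon as it has a head to return. With no match, as in the question, it forces every element, and whether that memory can be reclaimed along the way depends on whether something still references the start of the stream.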