I recently learned about Streams in Java 8 and saw this example:
IntStream stream = IntStream.range(1, 20);
Now, let's say that we want to find the first number that's divisible by both 3 and 5. We'd probably filter twice and call findFirst as follows:
OptionalInt result = stream.filter(x -> x % 3 == 0)
.filter(x -> x % 5 == 0)
.findFirst();
That all sounds pretty reasonable. The surprising part came when I tried to do this:
OptionalInt result = stream.filter(x -> {System.out.println(x); return x % 3 == 0;})
.filter(x -> {System.out.println(x); return x % 5 == 0;})
.findFirst();
System.out.println(result.getAsInt());
I expected to get something like: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
and then: 3 6 9 12 15 18
because we first iterate over all the numbers from 1 to 20, keep only those that are divisible by 3, and then iterate over this new Stream to find those that are divisible by 5.
But instead I got this output: 1 2 3 3 4 5 6 6 7 8 9 9 10 11 12 12 13 14 15 15 15
It looks like it doesn't go over all the numbers. Moreover, it looks like it checks x % 5 == 0 only for those numbers that are divisible by 3.
I don't understand how come it doesn't iterate over all of the numbers.
Here's an online snippet of the code: http://www.tryjava8.com/app/snippets/5454a7f2e4b070922a64002b
Well, the thing to understand about streams is that, unlike lists, they don't (necessarily) hold all the items but rather compute one item at a time (lazy evaluation).
It means that when you did IntStream stream = IntStream.range(1, 20); you didn't actually create a collection holding all the items. You created a dynamically computed sequence (note that range excludes its upper bound, so the values are 1 through 19). Each request for the stream's next item computes just that item. The rest of the items are still "not there" (so to speak).
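To make the "one item per request" idea concrete, here is a minimal sketch. A stream has no next method of its own, so the sketch assumes the closest stand-in, the iterator() view of the stream; the class and method names (LazyRangeDemo, sumOfFirst) are made up for illustration.

```java
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

class LazyRangeDemo {
    // Pulls at most `count` elements by hand and returns their sum.
    static int sumOfFirst(int count) {
        // Building the stream does not materialize the values 1..19.
        PrimitiveIterator.OfInt it = IntStream.range(1, 20).iterator();
        int sum = 0;
        for (int i = 0; i < count && it.hasNext(); i++) {
            sum += it.nextInt(); // each call hands back exactly one more element
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfFirst(3));   // 1 + 2 + 3
        System.out.println(sumOfFirst(100)); // the range runs out after 19
    }
}
```

Asking for 100 elements simply stops once the range is exhausted, which also shows that the upper bound 20 is never produced.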
Same goes for the filter.
When you add the filter that checks divisibility by 3, you get a new stream that combines two computations: the first returns the numbers from 1 up to 20, and the second keeps only the numbers that are divisible by 3. It's important to understand that only one item is computed at a time, and each item travels through the whole pipeline before the next one is produced. That's why the check for divisibility by 5 only ran on the items that were divisible by 3. It's also why the printing stopped at 15: the findFirst method returns the first number that passes all 3 computations (the 1-20 range computation, the division by 3 computation and the division by 5 computation), and once it has that number nothing further is evaluated.
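The interleaving described above can be checked with a small sketch that labels each predicate call instead of printing the bare number. The class name FilterTrace and the "by3:"/"by5:" labels are hypothetical, chosen only for this illustration; the pipeline itself is the one from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.stream.IntStream;

class FilterTrace {
    // Runs the question's pipeline and records every predicate call in order.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        OptionalInt result = IntStream.range(1, 20)
                .filter(x -> { log.add("by3:" + x); return x % 3 == 0; })
                .filter(x -> { log.add("by5:" + x); return x % 5 == 0; })
                .findFirst();
        log.add("result:" + result.getAsInt());
        return log;
    }

    public static void main(String[] args) {
        // Starts with by3:1, by3:2, by3:3, by5:3, by3:4, ...
        trace().forEach(System.out::println);
    }
}
```

With the labels in place, the output from the question reads naturally: each number reaches the second filter only right after it passed the first one, and nothing is logged past 15.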
A Stream is a lazy evaluation mechanism for processing sequences of data, such as Collections, efficiently. This means that the intermediate operations on the Stream are not evaluated unless they are needed by the final (terminal) operation.
In your example, the terminal operation is findFirst(). This means the Stream will evaluate the pipeline of intermediate operations only until it finds a single int that results from passing the input Stream through all the intermediate operations.
The second filter only receives ints that pass the first filter, so it only processes the numbers 3, 6, 9, 12 and 15 and then stops, since 15 passes the filter and gives the findFirst() operation the only output it needs.
The first filter will only process ints of the input Stream as long as the terminal operation still requires data, and therefore it will only process 1 to 15.
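As a rough check of these numbers, the following sketch counts how often each filter's predicate runs, with and without the terminal operation. FilterCounts and countCalls are hypothetical names introduced for this example only.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class FilterCounts {
    // Returns {calls to the first filter, calls to the second filter}.
    static int[] countCalls(boolean runTerminal) {
        AtomicInteger first = new AtomicInteger();
        AtomicInteger second = new AtomicInteger();
        // Declaring the pipeline only describes the work; nothing runs yet.
        IntStream pipeline = IntStream.range(1, 20)
                .filter(x -> { first.incrementAndGet(); return x % 3 == 0; })
                .filter(x -> { second.incrementAndGet(); return x % 5 == 0; });
        if (runTerminal) {
            pipeline.findFirst(); // the terminal operation drives the evaluation
        }
        return new int[] { first.get(), second.get() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(countCalls(false)));
        System.out.println(Arrays.toString(countCalls(true)));
    }
}
```

Without findFirst() neither predicate is ever invoked; with it, the first filter sees 1 to 15 and the second sees only the five multiples of 3 in that span.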