I wrote this code to reduce a list of words to a long count of how many words start with an 'A'. I'm just writing it to learn Java 8, so I'd like to understand it a little better [Disclaimer: I realize this is probably not the best way to write this code; it's just for practice!].
Long countOfAWords = results.stream().reduce(
0L,
(a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
Long::sum);
The middle parameter/lambda (called the accumulator) would seem to be capable of reducing the full list without the final 'Combiner' parameter. In fact, the Javadoc actually says:
The {@code accumulator} function acts as a fused mapper and accumulator, which can sometimes be more efficient than separate mapping and reduction, such as when knowing the previously reduced value allows you to avoid some computation.
[Edit From Author] - The following statement is wrong, so don't let it confuse you; I'm just keeping it here so I don't ruin the original context of the answers.
Anyway, I can infer that the accumulator must just be outputting 1's and 0's which the combiner combines. I didn't find this particularly obvious from the documentation though.
My Question
Is there a way to see what the output would be before the combiner executes so I can see the list of 1's and 0's that the combiner combines? This would be helpful in debugging more complex situations which I'm sure I'll come across eventually.
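The combiner does not reduce a list of 0's and 1's. In this case, when the stream is not run in parallel, the combiner is not used at all, so the reduction is equivalent to a plain loop that starts from the identity value and applies the accumulator to each element in turn.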
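As a rough sketch of that equivalence (the class name, method names and sample words are made up for illustration, and `results` is assumed to be a `List<String>`), the loop and the sequential `reduce` should give the same count:

```java
import java.util.Arrays;
import java.util.List;

class SequentialReduceLoop {
    // Hand-written equivalent of the sequential reduce: only the accumulator logic runs.
    static long countWithLoop(List<String> results) {
        long count = 0L; // plays the role of the identity value
        for (String word : results) {
            // same logic as the accumulator (a, b) -> b.charAt(0) == 'A' ? a + 1 : a
            count = word.charAt(0) == 'A' ? count + 1 : count;
        }
        return count;
    }

    // The reduce call from the question, unchanged.
    static long countWithReduce(List<String> results) {
        return results.stream().reduce(
                0L,
                (a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
                Long::sum);
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from the question.
        List<String> results = Arrays.asList("Apple", "Banana", "Avocado", "Cherry");
        System.out.println(countWithLoop(results));
        System.out.println(countWithReduce(results));
    }
}
```

Note that both versions assume every word is non-empty, as the original code does.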
When you run the stream in parallel, the task is split across multiple threads. For example, the data in the pipeline is partitioned into chunks, and each chunk is evaluated independently and produces its own result. The combiner is then used to merge these results.
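So you won't see a list being reduced. Instead, you'll see pairs of values being summed, where each value is either the identity or a partial count computed by one of the tasks. For example, if you add a print statement to the combiner and run the stream in parallel, you can watch those pairs go by; which pairs appear, and in what order, can vary from run to run.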
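Something along these lines could be used to try it, assuming `results` is a `List<String>`; the class name, the message text and the sample words are only illustrative. No sample output is shown because it depends on how the stream happens to be split:

```java
import java.util.Arrays;
import java.util.List;

class CombinerTrace {
    // Same reduction as in the question, but on a parallel stream and with a
    // combiner that prints the two partial results before adding them.
    static long countAWordsInParallel(List<String> results) {
        return results.parallelStream().reduce(
                0L,
                (a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
                (left, right) -> {
                    System.out.println("combining " + left + " and " + right);
                    return left + right;
                });
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        List<String> results = Arrays.asList("Apple", "Banana", "Avocado", "Cherry", "Apricot", "Date");
        System.out.println("count = " + countAWordsInParallel(results));
    }
}
```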
More generally, if you want to see the data flowing through the pipeline as it runs, you can use peek (the debugger could also help). Applied to your example, peek can print each word just before it reaches the accumulator.
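A minimal sketch of the `peek` approach, again assuming `results` is a `List<String>` (the class name, the "saw:" prefix and the sample words are made up). On a sequential stream the words are printed in list order as each one is consumed:

```java
import java.util.Arrays;
import java.util.List;

class PeekTrace {
    static long countAWordsWithPeek(List<String> results) {
        return results.stream()
                // peek runs for every element that flows past this point in the pipeline
                .peek(word -> System.out.println("saw: " + word))
                .reduce(
                        0L,
                        (a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
                        Long::sum);
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        List<String> results = Arrays.asList("Apple", "Banana", "Avocado");
        System.out.println("count = " + countAWordsWithPeek(results));
    }
}
```

`peek` is meant mainly for this kind of debugging; it only observes elements and should not be relied on to change them.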
The idiomatic way to achieve your task would be to filter the stream and then simply use count.
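A possible version of that filter-then-count pipeline, assuming `results` is a `List<String>` (class name and sample data are illustrative). It uses `startsWith("A")` rather than `charAt(0) == 'A'`, which is a small change from the original: it also tolerates empty strings instead of throwing.

```java
import java.util.Arrays;
import java.util.List;

class FilterCount {
    static long countAWords(List<String> results) {
        return results.stream()
                .filter(word -> word.startsWith("A")) // keep only words beginning with 'A'
                .count();                             // count() returns a primitive long
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        List<String> results = Arrays.asList("Apple", "Banana", "Avocado", "Cherry");
        System.out.println(countAWords(results));
    }
}
```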
Hope it helps! :)
One way to see what's going on is to replace the method reference Long::sum with a lambda that includes a println.
In this case, we can see that the combiner is not actually used. This is because the stream is not parallel. All we are really doing is using the accumulator to successively combine each String with the current Long result; no two Long values are ever combined.
If you replace stream with parallelStream, you can see that the combiner is used and look at the values it combines.
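The experiment described above could be sketched like this. The class, the `parallel` flag, the call counter and the sample words are all assumptions added for illustration; the counter just makes it easy to confirm whether the combiner ran. That the combiner is never called for a sequential stream is how the standard JDK implementation behaves, not something this sketch proves in general.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class CombinerUsageDemo {
    // Counts combiner invocations; atomic because a parallel stream may call
    // the combiner from several threads.
    static final AtomicInteger combinerCalls = new AtomicInteger();

    static long countAWords(List<String> results, boolean parallel) {
        combinerCalls.set(0);
        Stream<String> stream = parallel ? results.parallelStream() : results.stream();
        return stream.reduce(
                0L,
                (a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
                (x, y) -> { // stands in for Long::sum, plus a println
                    combinerCalls.incrementAndGet();
                    System.out.println("combiner: " + x + " + " + y);
                    return x + y;
                });
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        List<String> results = Arrays.asList("Apple", "Banana", "Avocado", "Cherry");

        long sequential = countAWords(results, false);
        System.out.println("stream: result " + sequential + ", combiner calls " + combinerCalls.get());

        long parallel = countAWords(results, true);
        System.out.println("parallelStream: result " + parallel + ", combiner calls " + combinerCalls.get());
    }
}
```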