Java8 stream.reduce() with 3 parameters - getting

Posted 2019-01-19 00:46

I wrote this code to reduce a list of words to a long count of how many words start with an 'A'. I'm just writing it to learn Java 8, so I'd like to understand it a little better [Disclaimer: I realize this is probably not the best way to write this code; it's just for practice!].

Long countOfAWords = results.stream().reduce(
    0L,
    (a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
    Long::sum);

The middle parameter/lambda (called the accumulator) would seem to be capable of reducing the full list without the final 'Combiner' parameter. In fact, the Javadoc actually says:

The {@code accumulator} function acts as a fused mapper and accumulator, which can sometimes be more efficient than separate mapping and reduction, such as when knowing the previously reduced value allows you to avoid some computation.

[Edit From Author] - The following statement is wrong, so don't let it confuse you; I'm just keeping it here so I don't ruin the original context of the answers.

Anyway, I can infer that the accumulator must just be outputting 1's and 0's which the combiner combines. I didn't find this particularly obvious from the documentation though.

My Question

Is there a way to see what the output would be before the combiner executes so I can see the list of 1's and 0's that the combiner combines? This would be helpful in debugging more complex situations which I'm sure I'll come across eventually.

2 Answers
Fickle 薄情
#2 · 2019-01-19 01:25

The combiner does not reduce a list of 0's and 1's. In this case, when the stream is not run in parallel, the combiner is not used at all, so the reduction is equivalent to the following loop:

U result = identity;
for (T element : this stream)
    result = accumulator.apply(result, element);
return result;
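
To make that equivalence concrete, here is a minimal runnable sketch that compares the three-argument reduce on a sequential stream with a hand-written loop. The class name, the helper reduceWithLoop and the sample words are made up for illustration; only the accumulator and the reduce call come from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

public class SequentialReduceDemo {

    // Hypothetical helper: start from the identity and fold each element
    // into the running result with the accumulator, one element at a time.
    static <T, U> U reduceWithLoop(List<T> elements, U identity, BiFunction<U, T, U> accumulator) {
        U result = identity;
        for (T element : elements) {
            result = accumulator.apply(result, element);
        }
        return result;
    }

    // The reduce call from the question, on a sequential stream.
    static long countWithReduce(List<String> words) {
        return words.stream().reduce(
                0L,
                (a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
                Long::sum);
    }

    // The same accumulator applied through the explicit loop; no combiner involved.
    static long countWithLoop(List<String> words) {
        return reduceWithLoop(words, 0L, (a, b) -> b.charAt(0) == 'A' ? a + 1 : a);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Apple", "Banana", "Avocado", "Cherry");
        System.out.println(countWithReduce(words)); // prints 2
        System.out.println(countWithLoop(words));   // prints 2
    }
}
```

Both methods walk the list in order and carry a single running Long, which is why the combiner has nothing to do here.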

When you run the stream in parallel, the task is split across multiple threads. For example, the data in the pipeline is partitioned into chunks that are evaluated independently, each producing its own partial result. The combiner is then used to merge these results.
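
As a rough mental model only (the real parallel stream decides for itself how to split the data and on which threads to run), the sketch below does the same thing by hand on a single thread: it cuts the list into two chunks, reduces each chunk from the identity with the accumulator, and merges the two partial results with the combiner. The class and method names and the two-way split are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

public class ChunkedReduceSketch {

    // Reduce one chunk on its own, starting from the identity.
    static long reduceChunk(List<String> chunk, long identity,
                            BiFunction<Long, String, Long> accumulator) {
        long partial = identity;
        for (String word : chunk) {
            partial = accumulator.apply(partial, word);
        }
        return partial;
    }

    // Split into two halves, reduce each independently, then merge the
    // two partial counts with the combiner.
    static long reduceInTwoChunks(List<String> words) {
        BiFunction<Long, String, Long> accumulator = (a, b) -> b.charAt(0) == 'A' ? a + 1 : a;
        BinaryOperator<Long> combiner = Long::sum;

        int middle = words.size() / 2;
        long left = reduceChunk(words.subList(0, middle), 0L, accumulator);
        long right = reduceChunk(words.subList(middle, words.size()), 0L, accumulator);
        return combiner.apply(left, right);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("A", "B", "A", "A", "C", "A", "A");
        System.out.println(reduceInTwoChunks(words)); // prints 5
    }
}
```

Note that the combiner only ever receives partial counts (here 2 and 3), never the individual words.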

So you won't see a list being reduced. Instead, the combiner receives two values at a time, each of which is either the identity value or a partial result computed by a task, and sums them. For example, if you add a print statement to the combiner:

(i1, i2) -> {System.out.println("Merging: "+i1+"-"+i2); return i1+i2;}); 

you could see something like this:

Merging: 0-0
Merging: 0-0
Merging: 1-0
Merging: 1-0
Merging: 1-1

This would be helpful in debugging more complex situations which I'm sure I'll come across eventually.

More generally, if you want to see the data as it flows through the pipeline, you can use peek (a debugger can also help). Applied to your example:

long countOfAWords = results.stream().map(s -> s.charAt(0) == 'A' ? 1 : 0).peek(System.out::print).mapToLong(l -> l).sum();

which can output:

100100

[Disclaimer: I realize this is probably not the best way to write this code; it's just for practice!].

The idiomatic way to achieve your task would be to filter the stream and then simply use count:

long countOfAWords = results.stream().filter(s -> s.charAt(0) == 'A').count();

Hope it helps! :)

来,给爷笑一个
#3 · 2019-01-19 01:48

One way to see what's going on is to replace the method reference Long::sum with a lambda that includes a println.

List<String> results = Arrays.asList("A", "B", "A", "A", "C", "A", "A");
Long countOfAWords = results.stream().reduce(
        0L,
        (a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
        (a, b) -> {
            System.out.println(a + " " + b);
            return Long.sum(a, b);
        });

In this case nothing is printed, so we can see that the combiner is not actually used. This is because the stream is not parallel. All we are really doing is using the accumulator to successively combine each String with the current Long result; no two Long values are ever combined.

If you replace stream() with parallelStream(), you can see that the combiner is used and look at the values it combines.
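
If you would rather compare the two cases side by side than read println output, one option is to count the combiner invocations. The sketch below is only an illustration: the class name, the countAWords helper and the static counter are assumptions, not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class CombinerCallCounter {

    // Counts combiner invocations. AtomicInteger is used because a parallel
    // stream may call the combiner from several threads.
    static final AtomicInteger combinerCalls = new AtomicInteger();

    // Same reduction as above, with a combiner that also bumps the counter.
    static long countAWords(Stream<String> words) {
        combinerCalls.set(0);
        return words.reduce(
                0L,
                (a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
                (a, b) -> {
                    combinerCalls.incrementAndGet();
                    return Long.sum(a, b);
                });
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("A", "B", "A", "A", "C", "A", "A");

        long sequential = countAWords(results.stream());
        System.out.println("sequential: " + sequential + ", combiner calls: " + combinerCalls.get());

        long parallel = countAWords(results.parallelStream());
        System.out.println("parallel: " + parallel + ", combiner calls: " + combinerCalls.get());
    }
}
```

Both runs should report a count of 5 words. The sequential run is expected to show zero combiner calls, matching the observation above, while the number of calls in the parallel run depends on how the runtime splits the work and can vary between machines.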
