Strange behavior of Stream.spliterator for parallel streams

Posted 2019-02-16 17:48

Question:

I'm using the stream spliterator directly for low-level operations in a library I'm writing. I recently discovered very odd behavior when I take the stream spliterator and interleave tryAdvance/trySplit calls. Here's a simple program that demonstrates the problem:

import java.util.Arrays;
import java.util.Spliterator;

public class SpliteratorBug {
    public static void main(String[] args) {
        Integer[][] input = { { 1 }, { 2, 3 }, { 4, 5, 6 }, { 7, 8 }, { 9 } };
        Spliterator<Integer> spliterator = Arrays.stream(input).parallel()
                .flatMap(Arrays::stream).spliterator();
        spliterator.trySplit();
        spliterator.tryAdvance(s -> {});
        spliterator.trySplit();
        spliterator.forEachRemaining(System.out::println);
    }
}

The output is

5
6
9

After the flatMap, I should get an ordered stream of consecutive numbers from 1 to 9. I split the spliterator once, so it should jump to some intermediate position. Next I consume one element from it and split it once more. After that I print all the remaining elements. I expect to see several consecutive elements from the tail of the stream (possibly zero elements, which would also be fine). What I get instead is 5 and 6, then a sudden jump to 9.

I know that the JDK itself currently does not use spliterators this way: it always splits before traversal. However, the official documentation does not explicitly forbid calling trySplit after tryAdvance.

The problem never occurs when I use a spliterator created directly from a collection, array, generated source, etc. It occurs only when the spliterator is created from a parallel stream that has an intermediate flatMap.
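For comparison, here is a minimal sketch of the same split/advance/split sequence run against a spliterator taken directly from an array. The class name ArraySpliteratorBaseline and the helper remainingAfterInterleaving are hypothetical names introduced only for this illustration. For an ordered source that splits correctly, the remaining elements should form one consecutive run from the tail; exactly where that run starts depends on where the implementation chooses to split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

// Hypothetical helper class, used only to illustrate the comparison.
class ArraySpliteratorBaseline {
    // Runs the same trySplit / tryAdvance / trySplit sequence as the
    // program above and collects whatever is left in the spliterator.
    static List<Integer> remainingAfterInterleaving(Spliterator<Integer> spliterator) {
        spliterator.trySplit();
        spliterator.tryAdvance(s -> {});
        spliterator.trySplit();
        List<Integer> rest = new ArrayList<>();
        spliterator.forEachRemaining(rest::add);
        return rest;
    }

    public static void main(String[] args) {
        Integer[] flat = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        // Spliterator created directly from an array, no stream pipeline involved.
        System.out.println(remainingAfterInterleaving(Arrays.spliterator(flat)));
    }
}
```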

So the question is: did I hit a bug, or is it explicitly forbidden somewhere to use the spliterator this way?

Answer 1:

From what I can see in the source of AbstractWrappingSpliterator and company, when you call tryAdvance, the output of flatMap (4, 5, 6) gets buffered and then 4 is consumed, leaving (5, 6) in the buffer. Then trySplit correctly splits off (7, 8) to the new Spliterator, leaving 9 in the old one, but the buffered (5, 6) stays with the old Spliterator.

So this looks like a bug to me. It should either hand the buffer off to the new Spliterator or return null and refuse to split if the buffer is not empty.
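The second option (refuse to split once traversal has begun) can be sketched from the outside as a delegating wrapper. This is only an illustration under stated assumptions: NoSplitAfterAdvance is a hypothetical class, not part of the JDK, and it does not inspect the wrapped spliterator's internal buffer. It simply remembers whether any traversal method has been called and returns null from trySplit afterwards.

```java
import java.util.Comparator;
import java.util.Spliterator;
import java.util.function.Consumer;

// Hypothetical wrapper (not a JDK class): delegates everything to `source`
// but refuses to split once traversal has started.
final class NoSplitAfterAdvance<T> implements Spliterator<T> {
    private final Spliterator<T> source;
    private boolean traversalStarted;

    NoSplitAfterAdvance(Spliterator<T> source) {
        this.source = source;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        traversalStarted = true;
        return source.tryAdvance(action);
    }

    @Override
    public void forEachRemaining(Consumer<? super T> action) {
        traversalStarted = true;
        source.forEachRemaining(action);
    }

    @Override
    public Spliterator<T> trySplit() {
        // The Spliterator contract allows returning null for any reason,
        // so declining to split after traversal has begun is legal.
        if (traversalStarted) {
            return null;
        }
        Spliterator<T> prefix = source.trySplit();
        return prefix == null ? null : new NoSplitAfterAdvance<>(prefix);
    }

    @Override
    public long estimateSize() {
        return source.estimateSize();
    }

    @Override
    public int characteristics() {
        return source.characteristics();
    }

    @Override
    public Comparator<? super T> getComparator() {
        return source.getComparator();
    }
}
```

Wrapping the stream spliterator from the question in new NoSplitAfterAdvance<>(...) makes the second trySplit call return null, so the buffered elements and the rest of the source stay together in one spliterator.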



Answer 2:

From the documentation of Spliterator.trySplit():

This method may return null for any reason, including emptiness, inability to split after traversal has commenced, data structure constraints, and efficiency considerations.

(emphasis mine, on "inability to split after traversal has commenced")

So the documentation explicitly mentions the possibility of attempting to split after traversal has commenced, and suggests that spliterators unable to handle this may return null.

So for ordered spliterators, the observed behavior should be considered a bug, as described by Misha. More generally, the fact that trySplit() has to return a prefix spliterator (in other words, has to hand over all intermediate state regarding the next items to the new spliterator) is a peculiarity of the Spliterator API that makes bugs likely. This question prompted me to check my own spliterator implementations, and I found a similar bug…
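A check of this kind can be sketched as follows. The class PrefixSplitCheck and its methods are hypothetical names introduced for this illustration, and the check covers only one simple pattern: advance a few times, split once, then drain the prefix before the remainder. For an ordered spliterator, that sequence must reproduce the full encounter order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Supplier;

// Hypothetical test helper for ordered spliterators.
class PrefixSplitCheck {
    // Advances up to `advances` times, splits once, then drains the split-off
    // prefix followed by the remainder. For an ordered spliterator the
    // result must equal the full encounter order.
    static <T> List<T> drainAfterAdvanceAndSplit(Spliterator<T> spliterator, int advances) {
        List<T> seen = new ArrayList<>();
        for (int i = 0; i < advances; i++) {
            if (!spliterator.tryAdvance(seen::add)) {
                break;
            }
        }
        Spliterator<T> prefix = spliterator.trySplit();
        if (prefix != null) {
            prefix.forEachRemaining(seen::add);
        }
        spliterator.forEachRemaining(seen::add);
        return seen;
    }

    // Repeats the check for every possible number of initial advances,
    // using a fresh spliterator from `source` each time.
    static <T> boolean keepsEncounterOrder(Supplier<Spliterator<T>> source, List<T> expected) {
        for (int advances = 0; advances <= expected.size(); advances++) {
            if (!drainAfterAdvanceAndSplit(source.get(), advances).equals(expected)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> expected = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9);
        System.out.println(keepsEncounterOrder(expected::spliterator, expected));
    }
}
```

A spliterator that leaves buffered elements behind when it splits, as described in the first answer, would emit the prefix ahead of those buffered elements and fail this comparison.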



Answer 3:

This behavior was officially recognized as a bug (see JDK-8148838), fixed by me, and pushed into the JDK 9 trunk (see changeset). The sad thing is that my initial patch actually fixed splitting after flatMap (see webrev), but that patch was declined because this scenario (using trySplit() after tryAdvance()) was considered uncommon and discouraged. The accepted solution is to disable WrappingSpliterator splitting after advance altogether, which is enough to fix the problem.