Is Java 8 findFirst().isPresent() more efficient t

2019-04-06 09:39发布

问题:

Given I have a stream Stream<T> stream = list.stream().filter(some predicate) where the list is very large, is it more efficient to check whether the stream is non-empty by doing: stream.count() > 0 or by doing: stream.findFirst().isPresent()?

回答1:

If all you want to know, is whether there is a match, you should use
list.stream().anyMatch(some predicate), not only because it is more efficient, but also, because it is the correct idiom for expressing you intention.

As said by others, anyMatch is short-circuiting, which implies that it will stop at the first match, whereas count will, as the name suggests, count all matches before returning. Depending on the stream contents, this can make a huge performance difference. But note, that you can make count similarly efficient, by using list.stream().filter(some predicate).limit(1).count() > 0

Then, it will also stop after the first occurrence, but, as said, anyMatch is still the preferred way to express that you are interested in whether there is any match. Things change, when the task is to find out whether there are at least n matches. Then, .limit(n).count() > n-1 (or >= n) becomes the natural idiom.

Note that findFirst() differs from the other solution because its answer depends on the ordering. So if you only want to know whether there is a match, you should use findAny() instead. Still, there is a theoretical difference due to the requirement of returning the matching value, compared to just telling whether there is a match, like anyMatch does, though currently that difference lies only in the construction of an Optional instance, hence is negligible.

But since you are programming against an API to encode your intention, you should not use find… when you only want to know whether there is a match. anyMatch clearly expresses your intention and might have an even higher benefit in future implementations or more complex scenarios.



回答2:

I would recommend using list.stream().anyMatch(some predicate), which is a terminal operation for exactly this case. It's not only more efficient than stream.count(), but it won't hang on infinite streams.



回答3:

findAny (which is preferable to findFirst if you do not need ordering) and anyMatch are short-circuiting operations, which means they can return early without consuming the entire stream if the conditions allow. This is mentioned and linked in their method javadocs. count() is not.

If the stream's final stage still uses a spliterator with the SIZED characteristic then count() may be as fast as the other two options. But this is a far weaker property than short-circuiting since intermediate stream operations – such as filter() – are very likely to discard the SIZED aspect.

All this information can be gleaned from the package documentation, it's highly recommended reading.