Java 8 streams allow code that is, in most cases, a lot more readable than old-fashioned for loops. However, based on my own experience and what I've read, using a stream instead of a for loop can involve a performance hit (or occasionally an improvement) which is sometimes difficult to predict.
In a large project it doesn't seem feasible to write a benchmark test for every loop, so when deciding whether to replace a for loop with a stream, what are the key factors (e.g. expected size of the collection, expected percentage of values removed by filtering, complexity of iterative operations, the type of reduction or aggregation, etc.) that give a likely indication of the performance change that will result?
Note: this is a narrowing of my earlier question, which was closed for being too broad (and for which the aspects of parallel streams were pretty well covered in another SO question), so let's just limit this to sequential streams.
It’s not only “not feasible to write a benchmark test for every loop”, it’s counterproductive. A particular, application-specific loop may perform entirely differently when put into a micro-benchmark.
For an actual application, the standard rule of optimization applies: don’t do it. Just write whatever is more readable and only if there is a performance problem, profile the entire application to check whether a particular loop or stream use really is the bottleneck. Only if this is the case, you may try to switch between both idioms at the particular bottleneck to see whether it makes a difference.
In most cases, it won’t. If there is a real performance issue, it will stem from the type of operation, e.g. performing a nested iteration with O(n²) time complexity. Such problems do not depend on whether you use a Stream or a for loop, and the minor performance differences between these two idioms don’t change how your code scales.
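To illustrate the point about scaling, here is a minimal sketch (the class and method names are made up for this example). Counting the elements of one list that also occur in another is quadratic when it relies on List.contains, whether it is written as a loop or as a stream. Switching the lookup to a HashSet makes it roughly linear in both idioms.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical example class; the names are illustrative only.
public class ComplexityDominates {

    // O(n*m): List.contains scans b linearly for every element of a.
    public static long countCommonQuadraticLoop(List<Integer> a, List<Integer> b) {
        long count = 0;
        for (Integer x : a) {
            if (b.contains(x)) {
                count++;
            }
        }
        return count;
    }

    // Still O(n*m): the stream only changes the syntax, not the lookup cost.
    public static long countCommonQuadraticStream(List<Integer> a, List<Integer> b) {
        return a.stream().filter(b::contains).count();
    }

    // Expected O(n+m): the HashSet gives constant-time lookups on average.
    public static long countCommonLinearLoop(List<Integer> a, List<Integer> b) {
        Set<Integer> lookup = new HashSet<>(b);
        long count = 0;
        for (Integer x : a) {
            if (lookup.contains(x)) {
                count++;
            }
        }
        return count;
    }

    // Expected O(n+m) as well, written as a sequential stream.
    public static long countCommonLinearStream(List<Integer> a, List<Integer> b) {
        Set<Integer> lookup = new HashSet<>(b);
        return a.stream().filter(lookup::contains).count();
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3, 4);
        List<Integer> b = Arrays.asList(3, 4, 5);
        System.out.println(countCommonQuadraticLoop(a, b));
        System.out.println(countCommonQuadraticStream(a, b));
        System.out.println(countCommonLinearLoop(a, b));
        System.out.println(countCommonLinearStream(a, b));
    }
}
```

All four methods return the same count. For large inputs, the choice of data structure should matter far more than the choice between loop and stream.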
There aren't big general speed differences between streams and loops; their advantages and disadvantages are problem-specific. Whether you choose one or the other should depend (mostly) on the readability of the code. For some performance comparisons, see Benchmark1 and Benchmark2, where you can notice Brian Goetz's comment on one of the answers:
Your conclusion about performance, while valid, is overblown. There are plenty of cases where the stream code is faster than the iterative code, largely because per-element access costs is cheaper with streams than with plain iterators. And in many cases, the streams version inlines to something that is equivalent to the hand-written version. Of course, the devil is in the details; any given bit of code might behave differently.
Apart from that, just make sure that when you benchmark, you use JMH.
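For reference, this is the kind of loop/stream pair such a comparison would be about. It is only a sketch with made-up names and deliberately does no timing; the two methods compute the same result, and which one runs faster on a given JVM and data set is exactly what a proper JMH benchmark would have to establish.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; the names are illustrative only.
public class LoopVsStream {

    // Classic for-each loop: filter, transform and accumulate by hand.
    public static int sumOfEvenSquaresLoop(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            if (v % 2 == 0) {
                sum += v * v;
            }
        }
        return sum;
    }

    // Sequential stream: the same steps expressed as a pipeline.
    public static int sumOfEvenSquaresStream(List<Integer> values) {
        return values.stream()
                .filter(v -> v % 2 == 0)
                .mapToInt(v -> v * v)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(sumOfEvenSquaresLoop(values));   // prints 56
        System.out.println(sumOfEvenSquaresStream(values)); // prints 56
    }
}
```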