I am trying to calculate the sum of squares of values in the list.
Below are three variations which all calculates the required value.
I want to know which one is the most efficient. I am expecting the third one
to be more efficient as auto boxing is only done once.
// sum of squares
int sum = list.stream().map(x -> x * x).reduce((x, y) -> x + y).get();
System.out.println("sum of squares: " + sum);
sum = list.stream().mapToInt(x -> x * x).sum();
System.out.println("sum of squares: " + sum);
sum = list.stream().mapToInt(x -> x).map(x -> x * x).sum();
System.out.println("sum of squares: " + sum);
When in doubt, test! Using jmh, I get the following results on a list of 100k elements (in microseconds, smaller is better):
Benchmark Mode Samples Score Error Units
c.a.p.SO32462798.for_loop avgt 10 119.110 0.921 us/op
c.a.p.SO32462798.mapToInt avgt 10 129.702 1.040 us/op
c.a.p.SO32462798.mapToInt_map avgt 10 129.753 1.516 us/op
c.a.p.SO32462798.map_reduce avgt 10 1262.802 12.197 us/op
c.a.p.SO32462798.summingInt avgt 10 134.821 1.203 us/op
So you have, from faster to slower:
for(int i : list) sum += i*i;
mapToInt(x -> x * x).sum()
and mapToInt(x -> x).map(x -> x * x).sum()
collect(Collectors.summingInt(x -> x * x))
map(x -> x * x).reduce((x, y) -> x + y).get()
Note that the results are very much dependent on the JIT optimisations. If the logic in the mapping is more complex, some of the optimisations may be unavailable (longer code = less inlining) in which case the streams versions may take 4-5x more time than the for loop - but if that logic is CPU heavy the difference will reduce again. Profiling your actual application will give you more information.
Benchmark code for reference:
@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
public class SO32462798 {
List<Integer> list;
@Setup public void setup() {
list = new Random().ints(100_000).boxed().collect(toList());
}
@Benchmark public int for_loop() {
int sum = 0;
for (int i : list) sum += i * i;
return sum;
}
@Benchmark public int summingInt() {
return list.stream().collect(Collectors.summingInt(x -> x * x));
}
@Benchmark public int mapToInt() {
return list.stream().mapToInt(x -> x * x).sum();
}
@Benchmark public int mapToInt_map() {
return list.stream().mapToInt(x -> x).map(x -> x * x).sum();
}
@Benchmark public int map_reduce() {
return list.stream().map(x -> x * x).reduce((x, y) -> x + y).get();
}
}
I expect the second one to be the fastest.
There is boxing in neither the second nor the third example (if the list contains already boxed elements). But, there is unboxing.
Your second example might have two unboxing (one for every x
in x*x
), while the third does unboxing only once. However, unboxing is fast and I think it does not worth to optimize that as a longer pipeline with an additional function call will certainly slow it down.
Sidenote: in general, you should not expect Stream
s to be faster than regular iterations on arrays or lists. When doing mathematical calculations where speed matters (like this) it's better to go the other way: simply iterate through the elements. If your output is an aggregated value, then aggregate it, if it is a mapping, then allocate a new array or list of the same size and fill it with the calculated values.