Today, I want to share something that was blowing my mind when I tried to implement this simple operation:
I found different ways to perform the same operation:
- By using the
std::inner_product
. - Implementing a predicate and using the
std::accumulate
function. - Using a loop in C style.
I wanted to perform some benchmark by using Quick Bench and enabling all the optimizations.
First of all, I compared the two C++ alternatives with floating values. This is the code used by using std::accumulate
:
const auto predicate = [](const double previous, const double current) {
return previous + current * current;
};
const auto result = std::accumulate(input.cbegin(), input.cend(), 0, predicate);
Versus this code by using the std::inner_product
functionality:
const auto result = std::inner_product(input.cbegin(), input.cend(), input.cbegin(), 1);
After running the benchmark with all the optimization enabled, I got this result:
Both algorithms seem to reach the same performance. I did want to go further and try the C implementation:
double result = 0;
for (auto i = 0; i < input.size(); ++i) {
result += input[i] * input[i];
}
And surprisingly, I found:
I was not expecting this result. I was sure there is something wrong so I did check the GCC implementation:
template<typename _InputIterator1, typename _InputIterator2, typename _Tp>
inline _Tp
inner_product(_InputIterator1 __first1, _InputIterator1 __last1,
_InputIterator2 __first2, _Tp __init)
{
// concept requirements
__glibcxx_function_requires(_InputIteratorConcept<_InputIterator1>)
__glibcxx_function_requires(_InputIteratorConcept<_InputIterator2>)
__glibcxx_requires_valid_range(__first1, __last1);
for (; __first1 != __last1; ++__first1, (void)++__first2)
__init = __init + (*__first1 * *__first2);
return __init;
}
I found that It was doing the same as the C implementation. After reviewing the implementation, I discovered something weird, (or at least I was not expecting to have that significant impact): in all the internal accumulations, it was doing a cast from the iterator value_type to the type of the initial value.
In my case, I was initializing the initial values to 0 or 1, the values were considered integers and in each accumulation, the compiler was doing the casting. In the different test cases, my input array stores truncated floating points, so the result did not change.
After updating the initial value to a double type:
const auto result = std::accumulate(input.cbegin(), input.cend(), 0.0, predicate);
And:
const auto result = std::inner_product(input.cbegin(), input.cend(), input.cbegin(), 0.0);
I got the expected result:
Now, I understand that leaving the initial value to be an independent type from the underlying type of the iterator may make the function more flexible and allow to do more things. But,
If I am accumulating elements of an array, I am expecting to get the same type as a result. Same for the inner product.
Should it be the default behaviour?
Why did the standard decide to perform it in this way?
Your expectation is wrong (though it is not quite clear what "same type as result" means), as you can clearly see from std::accumulate documentation:
return type is exactly the same type you use for initial value. The same effect you can have on the loop:
Because this way you can decide what type you use to aggregate. Note
std::accumulate
can be used for left fold and cases whenT
not equal tostd::iterator_traits<InputIt>::value_type
not less often (probably even more) than when they match.