Difference in performance: std::accumulate vs std:

Today, I want to share something that was blowing my mind when I tried to implement this simple operation:

I found different ways to perform the same operation:

By using the std::inner_product.
Implementing a predicate and using the std::accumulate function.
Using a loop in C style.

I wanted to perform some benchmark by using Quick Bench and enabling all the optimizations.

First of all, I compared the two C++ alternatives with floating values. This is the code used by using std::accumulate:

const auto predicate = [](const double previous, const double current) {
    return previous + current * current;
};
const auto result = std::accumulate(input.cbegin(), input.cend(), 0, predicate);

Versus this code by using the std::inner_product functionality:

const auto result = std::inner_product(input.cbegin(), input.cend(), input.cbegin(), 1);

After running the benchmark with all the optimization enabled, I got this result:

Both algorithms seem to reach the same performance. I did want to go further and try the C implementation:

double result = 0;
for (auto i = 0; i < input.size(); ++i) {
  result += input[i] * input[i];
}

And surprisingly, I found:

I was not expecting this result. I was sure there is something wrong so I did check the GCC implementation:

template<typename _InputIterator1, typename _InputIterator2, typename _Tp>
inline _Tp
inner_product(_InputIterator1 __first1, _InputIterator1 __last1,
      _InputIterator2 __first2, _Tp __init)
{
  // concept requirements
  __glibcxx_function_requires(_InputIteratorConcept<_InputIterator1>)
  __glibcxx_function_requires(_InputIteratorConcept<_InputIterator2>)
  __glibcxx_requires_valid_range(__first1, __last1);

  for (; __first1 != __last1; ++__first1, (void)++__first2)
__init = __init + (*__first1 * *__first2);
  return __init;
}

I found that It was doing the same as the C implementation. After reviewing the implementation, I discovered something weird, (or at least I was not expecting to have that significant impact): in all the internal accumulations, it was doing a cast from the iterator value_type to the type of the initial value.

In my case, I was initializing the initial values to 0 or 1, the values were considered integers and in each accumulation, the compiler was doing the casting. In the different test cases, my input array stores truncated floating points, so the result did not change.

After updating the initial value to a double type:

const auto result = std::accumulate(input.cbegin(), input.cend(), 0.0, predicate);

And:

const auto result = std::inner_product(input.cbegin(), input.cend(), input.cbegin(), 0.0);

I got the expected result:

Now, I understand that leaving the initial value to be an independent type from the underlying type of the iterator may make the function more flexible and allow to do more things. But,

If I am accumulating elements of an array, I am expecting to get the same type as a result. Same for the inner product.

Should it be the default behaviour?

Why did the standard decide to perform it in this way?

标签： c++ stl c++14 profiling language-lawyer

1条回答

闹够了就滚

2楼-- · 2019-02-17 08:37

If I am accumulating elements of an array, I am expecting to get the same type as a result.

Your expectation is wrong (though it is not quite clear what "same type as result" means), as you can clearly see from std::accumulate documentation:

template< class InputIt, class T >
T accumulate( InputIt first, InputIt last, T init );

template< class InputIt, class T, class BinaryOperation >
T accumulate( InputIt first, InputIt last, T init,
              BinaryOperation op );

return type is exactly the same type you use for initial value. The same effect you can have on the loop:

auto result = 0; // vs auto result = 0.0;
for (auto i = 0; i < input.size(); ++i) {
  result += input[i] * input[i];
}

Why did the standard decide to perform it in this way?

Because this way you can decide what type you use to aggregate. Note std::accumulate can be used for left fold and cases when T not equal to std::iterator_traits<InputIt>::value_type not less often (probably even more) than when they match.

0人赞添加讨论(0) 举报

Difference in performance: std::accumulate vs std:

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间