I want to know why std::accumulate
(aka reduce) 3rd parameter is needed. For those who do not know what accumulate
is, it's used like so:
vector<int> V{1,2,3};
int sum = accumulate(V.begin(), V.end(), 0);
// sum == 6
Call to accumulate
is equivalent to:
sum = 0; // 0 - value of 3rd param
for (auto x : V) sum += x;
There is also optional 4th parameter, which allow to replace addition with any other operation.
Rationale that I've heard is that if you need let say not to add up, but multiply elements of a vector, we need other (non-zero) initial value:
vector<int> V{1,2,3};
int product = accumulate(V.begin(), V.end(), 1, multiplies<int>());
But why not do like Python - set initial value for V.begin()
, and use range starting from V.begin()+1
. Something like this:
int sum = accumulate(V.begin()+1, V.end(), V.begin());
This will work for any op. Why is 3rd parameter needed at all?
If you wanted
accumulate(V.begin()+1, V.end(), V.begin())
you could just write that. But what if you thought v.begin() might be v.end() (i.e. v is empty)? What ifv.begin() + 1
is not implemented (because v only implements ++, not generized addition)? What if the type of the accumulator is not the type of the elements? Eg.It's indeed not needed. Our codebase has 2 and 3-argument overloads which use a
T{}
value.However,
std::accumulate
is pretty old; it comes from the original STL. Our codebase has fancystd::enable_if
logic to distinguish between "2 iterators and initial value" and "2 iterators and reduction operator". That requires C++11. Our code also uses a trailing return type (auto accumulate(...) -> ...
) to calculate the return type, another C++11 feature.The way things are, it is annoying for code that knows for sure a range isn't empty and that wants to start accumulating from the first element of the range on. Depending on the operation that is used to accumulate with, it's not always obvious what the 'zero' value to use is.
If on the other hand you only provide a version that requires non-empty ranges, it's annoying for callers that don't know for sure that their ranges aren't empty. An additional burden is put on them.
One perspective is that the best of both worlds is of course to provide both functionality. As an example, Haskell provides both
foldl1
andfoldr1
(which require non-empty lists) alongsidefoldl
andfoldr
(which mirrorstd::transform
).Another perspective is that since the one can be implemented in terms of the other with a trivial transformation (as you've demonstrated:
std::transform(std::next(b), e, *b, f)
--std::next
is C++11 but the point still stands), it is preferable to make the interface as minimal as it can be with no real loss of expressive power.Because standard library algorithms are supposed to work for arbitrary ranges of (compatible) iterators. So the first argument to
accumulate
doesn't have to bebegin()
, it could be any iterator betweenbegin()
and one beforeend()
. It could also be using reverse iterators.The whole idea is to decouple algorithms from data. Your suggestion, if I understand it correctly, requires a certain structure in the data.
You're making a mistaken assumption: that type
T
is of the same type as theInputIterator
.But
std::accumulate
is generic, and allows all different kinds of creative accumulations and reductions.Example #1: Accumulate salary across Employees
Here's a simple example: an
Employee
class, with many data fields.You can't meaningfully "accumulate" a set of employees. That makes no sense; it's undefined. But, you can define an accumulation regarding the employees. Let's say we want to sum up all the monthly pay of all employees.
std::accumulate
can do that:So in this example, we're accumulating an
int
value over a collection ofEmployee
objects. Here, the accumulation sum isn't the same type of variable that we're actually summing over.Example #2: Accumulating an average
You can use
accumulate
for more complex types of accumulations as well - maybe want to append values to a vector; maybe you have some arcane statistic you're tracking across the input; etc. What you accumulate doesn't have to be just a number; it can be something more complex.For example, here's a simple example of using
accumulate
to calculate the average of a vector of ints:Example #3: Accumulate a running average
Another reason you need the initial value is because that value isn't always the default/neutral value for the calculation you're making.
Let's build on the average example we've already seen. But now, we want a class that can hold a running average -- that is, we can keep feeding in new values, and check the average so far, across multiple calls.
This is a case where we absolutely rely on being able to set that initial value for
std::accumulate
- we need to be able to initialize the accumulation from different starting points.In summary,
std::accumulate
is good for any time you're iterating over an input range, and building up one single result across that range. But the result doesn't need to be the same type as the range, and you can't make any assumptions about what initial value to use -- which is why you must have an initial instance to use as the accumulating result.