Copy every other element using standard algorithms

2019-02-27 11:16发布

问题:

say I have a std::vector with N elements. I would like to copy every n-th element of it to a new vector, or average up to that element then copy it (downsample the original vector). So I want to do this

std::vector<double> vec(N);
long n = 4;
std::vector<double> ds(N/n);
for(long i = 0; i < ds.size(); i+=n)
{
    ds[i] = vec[i*n];
}

or

for(long i = 0; i < ds.size(); i+=n)
{
    double tmp = 0;    
    for(long j = 0; j < n; j++)
    {
        tmp += vec[i*n+j];
    }
    ds[i] = tmp/static_cast<double>(n);
}

Is there a way to do this using the standard algorithms of C++? Like using std::copy with binary functions? I have billions of elements that I want to treat this way, and I want this to be as fast as possible.

PS: I would prefer not to use external libraries such as boost.

回答1:

For readability, the loop would be a good idea, as pointed out by Vlad in the comments. But if you really want to do someting like this, you could try:

int cnt=0,n=3; 
vector<int> u(v.size()/3); 
copy_if (v.begin(), v.end(), u.begin(), 
          [&cnt,&n] (int i)->bool {return ++cnt %n ==0; } ); 

If you want to average, it's getting worse as you'd have to similar tricks combining transform() with copy_if().

Edit: If you're looking for performance, you'd better stick to the loop, as stressed in the comments by davidhigh: it will avoid the overhead of the call to the lambda function for each element.

If you're looking for an algorithm because you're doing this very often, you'd better write your own generic one.



回答2:

If to use only standard library features and algorithms and if it is not allowed to use loops then the code can look the following way. Take into account that the code is based on the C++ 2014. If you need a code that will be compiled by a compiler that supports only C++ 2011 then you have to make some minor changes.

#include <iostream>
#include <vector>
#include <algorithm>
#include <numeric>
#include <iterator>

int main()
{
    const size_t N = 4;
    std::vector<double> src = { 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9 };
    size_t n = src.size() / N;
    std::vector<double> dst( n );

    std::copy_if( src.begin(), std::next( src.begin(), n * N ), dst.begin(),
                  [i = 0] ( auto ) mutable { return ( i = ( i + 1 ) % N ) == 0; } );

    for ( double x : dst ) std::cout << x << ' ';
    std::cout << std::endl;

    dst.assign( n, 0.0 );

    std::accumulate( src.begin(), std::next( src.begin(), n * N ), dst.begin(),
                     [i = 0] ( auto acc, auto x ) mutable
                     {
                         *acc += x; 
                         if ( ( i = ( i + 1 ) % N ) == 0 )  *acc++ /= N;
                         return acc;
                     } );

    for ( double x : dst ) std::cout << x << ' ';
    std::cout << std::endl;
}    

The program output is

4.4 8.8 
2.75 7.15 

This compound expression in the if condition

if ( ( i = ( i + 1 ) % N ) == 0 )  *acc++ /= N;

you can replace with more simpler one

if ( ++i % N == 0 )  *acc++ /= N;


回答3:

You could write your own generic algorithms inspired from the design principles used in <algorithm>.

For the copy of every n elements:

template<class in_it, class out_it>
out_it copy_every_n( in_it b, in_it e, out_it r, size_t n) {
    for (size_t i=distance(b,e)/n; i--; advance (b,n)) 
        *r++ = *b;
    return r;
}

Example of use:

vector<int> v {1,2,3,4,5,6,7,8,9,10};
vector<int> z(v.size()/3); 
copy_every_n(v.begin(), v.end(), z.begin(), 3);     

For averaging the elements n by n, you can use:

template<class in_it, class out_it>
out_it average_every_n( in_it b, in_it e, out_it r, size_t n) {
    typename out_it::value_type tmp=0;
    for (size_t cnt=0; b!=e; b++)  {
        tmp+=*b;
        if (++cnt==n) {
            cnt=0; 
            *r++=tmp/n;
            tmp=0;
        }
    }
    return r;
}

Example of use:

vector<int> w(v.size()/3); 
average_every_n(v.begin(), v.end(), w.begin(), 3);  

The advantage over your inital loops, is that this will work not only on vectors, but on any container providing the begin() and end() iterator. And it avoids overheads that I pointed out in my other answer.



回答4:

You may have explicitly stated that you prefer not to use Boost, but any non-Boost solution would really be implementing exactly this sort of thing anyway, so I'll show you how I would do it in Boost. Ultimately, I think you're better off writing a simple loop.

Downsampling uses strided

boost::copy(
        input | strided(2),
        std::back_inserter(output));

Downsampling average additionally uses transformed, though this solution is non-generic and specifically relies upon vector being contiguous:

boost::copy(
        input | strided(2) | transformed([](auto& x){
                return std::accumulate(&x, &x + 2, 0) / 2.;
            }),
        std::back_inserter(output));

Of course that has issues if the input isn't an exact multiple of the stride length, so it'd probably be better to do something like:

auto downsample_avg = [](auto& input, int n){
    return input | strided(n) | transformed([&,n](auto& x){
        auto begin = &x;
        auto end = begin + std::min<size_t>(n, &input.back() - begin + 1);
        return std::accumulate(begin, end, 0.0) / (end - begin);
    });
};

boost::copy(
    downsample_avg(input, 2),
    std::back_inserter(output));