How to split a vector into n “almost equal” parts

Posted 2020-02-11 18:34

I need to merge a large number of images using ImageMagick's convert.exe, but on Windows the command line is limited to 8192 bytes.

My solution is to split the task into smaller sub-tasks, run them, and then run a final task that combines their results.

My idea is to write a function that takes a vector of images and an integer n, and splits the vector into n sub-vectors of "almost equal" size.

For example, splitting 11 elements into 3 groups would give 4-4-3.

Can you tell me how I can do this in C++? I mean, how to write a function

split_vec( const vector<image> &images, int split )

which does the splitting?

Also, what is the most efficient way to do this if I don't need to create new vectors and only want to iterate through the sub-parts? Something like what std::string::substr does for std::string?

Note: I already use Boost in the project, so if there is some nice tool in Boost for this then it's perfect for me.

7 answers
Rolldiameter
#2 · 2020-02-11 18:43

To get the base size of each part, divide the total by the number of parts using integer division: 11 / 3 = 3. Some of the parts need to be one larger than that to reach the total, and the remainder tells you how many: 11 % 3 = 2. So 2 of the parts have size 3 + 1 = 4, and the remaining part has size 3.
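
The arithmetic above can be sketched as a small helper. The name `part_sizes` is hypothetical (it does not come from the question), and returning an empty result for n == 0 is an assumption made here to avoid dividing by zero.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sizes of n nearly equal parts of `total` items.
// The first (total % n) parts get one extra element.
std::vector<std::size_t> part_sizes(std::size_t total, std::size_t n) {
    std::vector<std::size_t> sizes;
    if (n == 0) return sizes;              // assumption: no parts requested, no sizes
    const std::size_t base = total / n;    // e.g. 11 / 3 == 3
    const std::size_t extra = total % n;   // e.g. 11 % 3 == 2
    sizes.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        sizes.push_back(base + (i < extra ? 1 : 0));
    return sizes;
}
```

For the example in the question, `part_sizes(11, 3)` yields the sizes 4, 4, 3.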

劳资没心,怎么记你
#3 · 2020-02-11 18:44

You can use iterators to walk through the sub-parts without copying anything. Iterators are used much like pointers to the elements of the vector.

What you want to do with the images could be implemented as a function that takes a range:

#include <vector>
using namespace std;

// Processes the images in the half-open range [begin, end).
void do_some_work(vector<image>::iterator begin, vector<image>::iterator end) {
    for (vector<image>::iterator i = begin; i != end; ++i) {
        // do something using *i, which is of type image
    }
}
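
As a minimal sketch of how the sub-ranges themselves might be computed without creating new vectors, the helper below returns pairs of iterators into the original vector. The name `split_ranges` and the choice to return `std::pair` values are assumptions for illustration, not something from the question or from Boost.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns n half-open iterator ranges [first, last)
// over `vec`, sized as evenly as possible. No elements are copied.
template <typename T>
std::vector<std::pair<typename std::vector<T>::const_iterator,
                      typename std::vector<T>::const_iterator>>
split_ranges(const std::vector<T>& vec, std::size_t n) {
    using Iter = typename std::vector<T>::const_iterator;
    std::vector<std::pair<Iter, Iter>> ranges;
    if (n == 0) return ranges;                 // assumption: n == 0 yields no ranges
    const std::size_t base = vec.size() / n;   // minimum size of each range
    const std::size_t extra = vec.size() % n;  // ranges that get one more element
    Iter first = vec.begin();
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t len = base + (i < extra ? 1 : 0);
        Iter last = first + static_cast<std::ptrdiff_t>(len);
        ranges.emplace_back(first, last);
        first = last;
    }
    return ranges;
}
```

Each pair can then be handed to a range-processing function that accepts const iterators. The ranges stay valid only as long as the vector is not resized, and when n exceeds the vector's size the trailing ranges are empty.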
女痞
#4 · 2020-02-11 18:46

Here is my solution:

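#include <algorithm>
#include <cstddef>
#include <vector>

template<typename T>
std::vector<std::vector<T>> SplitVector(const std::vector<T>& vec, size_t n)
{
    std::vector<std::vector<T>> outVec;
    if (n == 0)
        return outVec;  // avoid dividing by zero below

    size_t length = vec.size() / n;  // base size of each part
    size_t remain = vec.size() % n;  // number of parts that get one extra element

    size_t begin = 0;
    size_t end = 0;

    // When n > vec.size(), only vec.size() parts (one element each) are produced.
    for (size_t i = 0; i < std::min(n, vec.size()); ++i)
    {
        end += length;
        if (remain > 0)
        {
            ++end;
            --remain;
        }

        outVec.push_back(std::vector<T>(vec.begin() + begin, vec.begin() + end));

        begin = end;
    }

    return outVec;
}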
我只想做你的唯一
#5 · 2020-02-11 18:49

Have you thought about using the xargs program? It may be a higher-level solution to the problem.

再贱就再见
#6 · 2020-02-11 18:51

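CreateProcess has a 32 KB limit (32,767 characters) on the command line, which is higher than the 8192-byte limit mentioned in the question.

Or, if you want to go via the shell, process the vector in chunks of at most stride elements: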

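// vec is a std::vector<image>; stride is the chunk size (a std::ptrdiff_t > 0).
// do_range(first, last) processes the half-open range [first, last).
std::vector<image>::const_iterator i = vec.begin();

// Compare distances instead of advancing an iterator past vec.end().
while (vec.end() - i > stride) {
    do_range(i, i + stride);
    i += stride;
}

do_range(i, vec.end());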
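
A fixed stride bounds the number of items per call, not the length of the resulting command. As a different technique, batches could instead be formed by accumulated name length. The sketch below is only an illustration: `batch_by_length` is a hypothetical name, it assumes names are joined by single spaces with no quoting, and it assumes the caller has already subtracted the fixed part of the command (program name, options, output file) from `max_len`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: groups file names into batches so that the names in
// each batch, joined by single spaces, fit within `max_len` characters.
// A name longer than `max_len` ends up in a batch of its own.
std::vector<std::vector<std::string>>
batch_by_length(const std::vector<std::string>& names, std::size_t max_len) {
    std::vector<std::vector<std::string>> batches;
    std::size_t used = 0;  // characters used by the current batch
    for (const std::string& name : names) {
        // Start a new batch for the first name, or when adding a space
        // plus this name would exceed the limit.
        if (batches.empty() || used + 1 + name.size() > max_len) {
            batches.emplace_back();
            used = 0;
        } else {
            used += 1;  // separating space
        }
        used += name.size();
        batches.back().push_back(name);
    }
    return batches;
}
```

Unlike the equal-count split, the batches here can differ in how many names they hold, since the goal is to stay under a length limit rather than to balance sizes.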
等我变得足够好
#7 · 2020-02-11 18:56

You could write a function template that takes the vector you want to split and the number of parts, and returns a std::vector<std::vector<T>>. With a for loop and iterators this is straightforward.

#include <iostream>
#include <iomanip>
#include <vector>
#include <algorithm>
#include <numeric>
#include <cstdint>

// Splits vec into n consecutive parts whose sizes differ by at most one.
// The first (vec.size() % n) parts get one extra element.
// Returns an empty result when n == 0.
template<typename T>
std::vector< std::vector<T> > split(const std::vector<T>& vec, uint64_t n) {
  std::vector< std::vector<T> > vec_of_vecs(n);
  if (n == 0)
    return vec_of_vecs;  // avoids dividing by zero below

  uint64_t quotient = vec.size() / n;
  uint64_t remainder = vec.size() % n;
  uint64_t first = 0;
  for (uint64_t i = 0; i < n; ++i) {
    // Parts before index `remainder` hold quotient + 1 elements, the rest hold quotient.
    uint64_t last = first + quotient + (i < remainder ? 1 : 0);
    vec_of_vecs[i] = std::vector<T>(vec.begin() + first, vec.begin() + last);
    first = last;
  }

  return vec_of_vecs;
}

constexpr uint64_t ONE_DIMENSION = 11;
constexpr uint64_t SPLITS = 3;

int main()
{
  std::vector<uint64_t> values(ONE_DIMENSION);
  std::iota(std::begin(values), std::end(values), 1);

  std::vector<std::vector<uint64_t>> vecs = split(values, SPLITS);

  // Print one part per line.
  for (const auto& part : vecs) {
    for (auto i : part)
      std::cout << std::setw(3) << i << " ";
    std::cout << std::endl;
  }

  return 0;
}