Parallelizing a for loop using openmp & replacing

I'd like to parallelize the following piece of code but am new to openmp and creating parallel code.

std::vector<DMatch> good_matches;
for (int i = 0; i < descriptors_A.rows; i++) {
   if (matches_RM[i].distance < 3 * min_dist) {
      good_matches.push_back(matches_RM[i]);
   }
}

I have tried

std::vector<DMatch> good_matches;
#pragma omp parallel for
for (int i = 0; i < descriptors_A.rows; i++) {
   if (matches_RM[i].distance < 3 * min_dist) {
      good_matches[i] = matches_RM[i];
   }
}

and

std::vector<DMatch> good_matches;
cv::DMatch temp;
#pragma omp parallel for
for (int i = 0; i < descriptors_A.rows; i++) {
   if (matches_RM[i].distance < 3 * min_dist) {
      temp = matches_RM[i];
      good_matches[i] = temp;
      // AND ALSO good_matches.push_back(temp);
   }

I have also tried

#omp parallel critical 
good_matches.push_back(matches_RM[i]);

This clause works but does not speed anything up. It may be the case that this for loop cannot be sped up but it'd be great if it can be. I'd also like to speed this up as well

std::vector<Point2f> obj, scene;
for (int i = 0; i < good_matches.size(); i++) {
   obj.push_back(keypoints_A[good_matches[i].queryIdx].pt);
   scene.push_back(keypoints_B[good_matches[i].trainIdx].pt);
}

Apologies if this question as been answered and thank you very much to anyone who can help.

标签： c++ for-loop parallel-processing openmp

3条回答

We Are One

2楼-- · 2019-02-20 01:06

One possibility may be to use private vectors for each thread and combine them in the end:

#include<omp.h>

#include<algorithm>
#include<iterator>
#include<iostream>
#include<vector>

using namespace std;

int main()
{
  vector<int> global_vector;  
  vector< vector<int> > buffers;

  #pragma omp parallel
  {
    auto nthreads = omp_get_num_threads();
    auto id = omp_get_thread_num();
    //
    // Correctly set the number of buffers
    //
  #pragma omp single
    {
      buffers.resize( nthreads );
    }
    //
    // Each thread works on its chunk
    // If order is important maintain schedule static
    //
  #pragma omp for schedule(static)
    for(size_t ii = 0; ii < 100; ++ii) {      
      if( ii % 2 != 0 ) { // Any other condition will do
          buffers[id].push_back(ii);
      }
    }
    //
    // Combine buffers together
    //
    #pragma omp single
    {
      for( auto & buffer : buffers) {
        move(buffer.begin(),buffer.end(),back_inserter(global_vector));
      }
    }
  }
  //
  // Print the result
  //
  for( auto & x : global_vector) {
    cout << x << endl;
  }    
  return 0;
}

The actual speed-up depends only on the amount of work done inside each loop.

0人赞添加讨论(0) 举报

该账号已被封号

3楼-- · 2019-02-20 01:25

TBB's concurrent_vector acts much like std::vector, but allows parallel calls to push_back.

0人赞添加讨论(0) 举报

男人必须洒脱

4楼-- · 2019-02-20 01:27

I showed how to do this here c-openmp-parallel-for-loop-alternatives-to-stdvector

Make private versions of the std::vector and fill the shared std::vector in a critical section like this:

std::vector<DMatch> good_matches;
#pragma omp parallel
{
    std::vector<DMatch> good_matches_private;
    #pragma omp for nowait
    for (int i = 0; i < descriptors_A.rows; i++) {
       if (matches_RM[i].distance < 3 * min_dist) {
          good_matches_private.push_back(matches_RM[i]);
       }
    }
    #pragma omp critical
    good_matches.insert(good_matches.end(), good_matches_private.begin(), good_matches_private.end());
}

0人赞添加讨论(0) 举报

Parallelizing a for loop using openmp & replacing

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间