A parallel for using std::thread?

Published 2019-07-17 14:15

仙女界的扛把子

I am new to std::thread and I am trying to write a parallel_for. I wrote the following:

// parallel_for.cpp
// compilation: g++ -O3 -std=c++0x parallel_for.cpp -o parallel_for -lpthread
// execution: time ./parallel_for 100 50000000 
// (100: number of threads, 50000000: vector size)
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <vector>
#include <thread>
#include <cmath>
#include <algorithm>
#include <numeric>
#include <utility>

// Parallel for
template<typename Iterator, class Function>
void parallel_for(const Iterator& first, const Iterator& last, Function&& f, const int nthreads = 1, const int threshold = 1000)
{
    const unsigned int group = std::max(std::max(1, std::abs(threshold)), (last-first)/std::abs(nthreads));
    std::vector<std::thread> threads;
    for (Iterator it = first; it < last; it += group) {
        threads.push_back(std::thread([=](){std::for_each(it, std::min(it+group, last), f);}));
    }
    std::for_each(threads.begin(), threads.end(), [=](std::thread& x){x.join();});
}

// Function to apply
template<typename Type>
void f1(Type& x)
{
    x = std::sin(x)+std::exp(std::cos(x))/std::exp(std::sin(x)); 
}

// Main
int main(int argc, char* argv[]) {

    const unsigned int nthreads = (argc > 1) ? std::atol(argv[1]) : (1);
    const unsigned int n = (argc > 2) ? std::atol(argv[2]) : (100000000);
    double x = 0;
    std::vector<double> v(n);
    std::iota(v.begin(), v.end(), 0);

    parallel_for(v.begin(), v.end(), f1<double>, nthreads);

    for (unsigned int i = 0; i < n; ++i) x += v[i];
    std::cout<<std::setprecision(15)<<x<<std::endl;
    return 0;
}

But this does not work (error from g++ 4.6):

parallel_for.cpp: In instantiation of ‘parallel_for(const Iterator&, const Iterator&, Function&&, int, int) [with Iterator = __gnu_cxx::__normal_iterator<double*, std::vector<double> >, Function = void (&)(double&)]::<lambda()>’:
parallel_for.cpp:22:9:   instantiated from ‘void parallel_for(const Iterator&, const Iterator&, Function&&, int, int) [with Iterator = __gnu_cxx::__normal_iterator<double*, std::vector<double> >, Function = void (&)(double&)]’
parallel_for.cpp:43:58:   instantiated from here
parallel_for.cpp:22:89: erreur: field ‘parallel_for(const Iterator&, const Iterator&, Function&&, int, int) [with Iterator = __gnu_cxx::__normal_iterator<double*, std::vector<double> >, Function = void (&)(double&)]::<lambda()>::__f’ invalidly declared function type

How can I solve this problem?

EDIT: This new version compiles, but it does not give the correct result:

// parallel_for.cpp
// compilation: g++ -O3 -std=c++0x parallel_for.cpp -o parallel_for -lpthread
// execution: time ./parallel_for 100 50000000 
// (100: number of threads, 50000000: vector size)
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <vector>
#include <thread>
#include <cmath>
#include <algorithm>
#include <numeric>
#include <utility>

// Parallel for
template<typename Iterator, class Function>
void parallel_for(const Iterator& first, const Iterator& last, Function&& f, const int nthreads = 1, const int threshold = 1000)
{
    const unsigned int group = std::max(std::max(1, std::abs(threshold)), (last-first)/std::abs(nthreads));
    std::vector<std::thread> threads;
    for (Iterator it = first; it < last; it += group) {
        threads.push_back(std::thread([=, &f](){std::for_each(it, std::min(it+group, last), f);}));
    }
    std::for_each(threads.begin(), threads.end(), [](std::thread& x){x.join();});
}

// Function to apply
template<typename Type>
void f(Type& x)
{
    x = std::sin(x)+std::exp(std::cos(x))/std::exp(std::sin(x)); 
}

// Main
int main(int argc, char* argv[]) {

    const unsigned int nthreads = (argc > 1) ? std::atol(argv[1]) : (1);
    const unsigned int n = (argc > 2) ? std::atol(argv[2]) : (100000000);
    double x = 0;
    double y = 0;
    std::vector<double> v(n);

    std::iota(v.begin(), v.end(), 0);
    std::for_each(v.begin(), v.end(), f<double>);
    for (unsigned int i = 0; i < n; ++i) x += v[i];

    std::iota(v.begin(), v.end(), 0);
    parallel_for(v.begin(), v.end(), f<double>, nthreads);
    for (unsigned int i = 0; i < n; ++i) y += v[i];

    std::cout<<std::setprecision(15)<<x<<" "<<y<<std::endl;
    return 0;
}

The result is:

./parallel_for 1 100
155.524339894552 4950

The sequential version returns 155, while the parallel version returns 4950... Where is the problem?

Answer 1:

You have to capture the function by reference.
[=, &f] () { /* your code */ };
Also, look at this code:
```
#include <iostream>

template <class T>
void foo(const T& t)
{
    const int a = t;
    [&] { std::cout << a << std::endl; }();
}

int main()
{
    foo(42);
    return 0;
}
```
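Clang prints 42, but g++ raises the warning 'a' is used uninitialized in this function and prints 0. This looks like a compiler bug.
Workaround: use const auto (for the variable group in your code).
UPD: I think this is the bug in question: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52026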
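
To make the capture rule concrete, here is a minimal sketch; the names `apply_in_thread` and `add_one` are hypothetical and do not come from the question. The callable arrives as a forwarding reference and is captured by reference, so the lambda never needs a data member of function type. This is safe only because the thread is joined before `f` goes out of scope.

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Hypothetical helper: applies f to every element of v on one worker thread.
template <class Function>
void apply_in_thread(std::vector<double>& v, Function&& f)
{
    // Both v and f are captured by reference. The references stay valid
    // because join() returns before either of them leaves scope.
    std::thread worker([&v, &f]() { std::for_each(v.begin(), v.end(), f); });
    worker.join();
}

// Hypothetical function to apply, with the same shape as f in the question.
inline void add_one(double& x) { x += 1.0; }
```

Calling `apply_in_thread(v, add_one)` deduces `Function` as a reference to a function, the same situation as in the question's `parallel_for`.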

Answer 2:

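You need a cast (type conversion) on (last-first). The reason is that implicit type conversions are never performed during template argument deduction, so std::max cannot be called with an int and an iterator difference.

This works fine (and also fixes the problems DeadMG and Ben Voigt found). Both versions give 156608294.151782 for n = 100000000.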

template<typename Iterator, class Function>
void parallel_for(const Iterator& first, const Iterator& last, Function&& f, const int nthreads = 1, const int threshold = 1000)
{
    const unsigned int group = std::max(std::max(ptrdiff_t(1), ptrdiff_t(std::abs(threshold))), ((last-first))/std::abs(nthreads));
    std::vector<std::thread> threads;
    threads.reserve(nthreads);
    Iterator it = first;
    for (; it < last-group; it += group) {
        threads.push_back(std::thread([=,&f](){std::for_each(it, std::min(it+group, last), f);}));
    }
    std::for_each(it, last, f); // last steps while we wait for other threads

    std::for_each(threads.begin(), threads.end(), [](std::thread& x){x.join();});
}

Since the step for_each(it, last, f) is smaller than the others, we may as well use the calling thread to finish it while waiting for the other results.
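
The deduction failure can be reproduced in isolation: `std::max(1, last - first)` does not compile when `int` and the iterator difference type differ, because both arguments must deduce the same `T`. A minimal sketch of an alternative to the casts, naming the template argument explicitly (the helper `chunk_size` is hypothetical and assumes `nthreads` is nonzero):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: elements per thread, never smaller than the threshold.
// Assumes nthreads != 0.
inline std::ptrdiff_t chunk_size(std::ptrdiff_t count, int nthreads, int threshold)
{
    // Spelling out <std::ptrdiff_t> skips deduction, so the int arguments
    // are converted implicitly instead of causing a deduction conflict.
    const std::ptrdiff_t min_chunk = std::max<std::ptrdiff_t>(1, std::abs(threshold));
    return std::max<std::ptrdiff_t>(min_chunk, count / std::abs(nthreads));
}
```

For example, `chunk_size(100, 4, 10)` yields 25, while `chunk_size(100, 4, 1000)` yields 1000 because the threshold dominates.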

Answer 3:

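One problem is that it += group may legally land exactly on last, but creating a value past the end is undefined behavior. Checking it < last afterwards is too late to fix that.

You need to test last - it instead, while it is still valid. (Neither it + group nor last - group is necessarily safe, although the latter should be, given the way group is calculated.)

For example: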

template<typename Iterator, class Function>
void parallel_for(const Iterator& first, const Iterator& last, Function f, const int nthreads = 1, const int threshold = 100)
{
    const std::ptrdiff_t group = std::max<std::ptrdiff_t>(std::max(1, std::abs(threshold)), (last-first)/std::abs(nthreads));
    std::vector<std::thread> threads;
    threads.reserve(nthreads);
    Iterator it = first;
    for (; last - it > group; it += group) {
        threads.push_back(std::thread([=, &f](){std::for_each(it, it+group, f);}));
    }
    threads.push_back(std::thread([=, &f](){std::for_each(it, last, f);}));

    std::for_each(threads.begin(), threads.end(), [](std::thread& x){x.join();});
}

Answer 4:

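You pass std::min(it+group, last) to std::for_each, but you always add group to it at the end of each iteration. This means that if last is not a multiple of group away from it, you will move it past last, which is UB.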
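
To make the arithmetic concrete: with 100 elements and `group` = 30, the loop in the question steps `it` through offsets 0, 30, 60, 90 and then 120, which is past `last`. A minimal sketch of a split that compares against the remaining distance instead, so no iterator beyond `last` is ever formed (the helper `split_range` is hypothetical and assumes random-access iterators and `group` > 0):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: cuts [first, last) into consecutive sub-ranges of at
// most `group` elements. Assumes random-access iterators and group > 0.
template <typename Iterator>
std::vector<std::pair<Iterator, Iterator>>
split_range(Iterator first, Iterator last, std::ptrdiff_t group)
{
    std::vector<std::pair<Iterator, Iterator>> chunks;
    Iterator it = first;
    // last - it is always well defined here; it + group is formed only after
    // checking that more than `group` elements remain.
    while (last - it > group) {
        chunks.emplace_back(it, it + group);
        it += group;
    }
    if (it != last) {
        chunks.emplace_back(it, last);  // final, possibly shorter, chunk
    }
    return chunks;
}
```

Each resulting pair can be handed to its own thread; for 100 elements and a group of 30 the split is three chunks of 30 followed by one chunk of 10.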

Answer 5:

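You need to capture by reference, and you need a cast (type conversion) on (last-first). The reason is that implicit type conversions are never performed during template argument deduction.

Also, after fixing the problem DeadMG found, you end up with the code below.

It works fine; both versions give 156608294.151782 for n = 100000000.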

template<typename Iterator, class Function>
void parallel_for(const Iterator& first, const Iterator& last, Function&& f, const int nthreads = 1, const int threshold = 1000)
{
    const unsigned int group = std::max(std::max(ptrdiff_t(1), ptrdiff_t(std::abs(threshold))), ((last-first))/std::abs(nthreads));
    std::vector<std::thread> threads;
    Iterator it = first;
    for (; it < last-group; it += group) {
        threads.push_back(std::thread([=,&f](){std::for_each(it, std::min(it+group, last), f);}));
    }
    std::for_each(it, last, f); // use calling thread while we wait for the others
    std::for_each(threads.begin(), threads.end(), [](std::thread& x){x.join();});
}

Answer 6:

A VC11 solution; please let me know if it does not work with GCC.

template<typename Iterator, class Function>
void parallel_for( const Iterator& first, const Iterator& last, Function&& f, const size_t nthreads = std::thread::hardware_concurrency(), const size_t threshold = 1 )
{
    const size_t portion = std::max( threshold, (last-first) / nthreads );
    std::vector<std::thread> threads;
    for ( Iterator it = first; it < last; it += portion )
    {
        Iterator begin = it;
        Iterator end = it + portion;
        if ( end > last )
            end = last;

        threads.push_back( std::thread( [=,&f]() {
            for ( Iterator i = begin; i != end; ++i )
                f(*i);  // pass the element, as the question's f(Type&) expects
        }));
    }
    std::for_each(threads.begin(), threads.end(), [](std::thread& x){x.join();});
}
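
One caveat with the default argument above: `std::thread::hardware_concurrency()` is only a hint and may return 0 when the value cannot be determined, in which case `(last-first) / nthreads` divides by zero. A minimal sketch of a guard (the helper name `default_thread_count` and the fallback of 1 are assumptions, not part of the answer):

```cpp
#include <cstddef>
#include <thread>

// Hypothetical helper: a usable thread count even when the hint is unavailable.
inline std::size_t default_thread_count()
{
    const unsigned int hint = std::thread::hardware_concurrency();
    return hint != 0 ? hint : 1;  // fall back to a single thread (assumed policy)
}
```

Using `default_thread_count()` as the default for `nthreads` keeps the division well defined on platforms where the hint is unavailable.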

Source: A parallel for using std::thread?