What is causing the data race in std::async here?

Posted 2019-06-08 04:29

Question:

I recently wrote a pattern-searching program for Conway's Game of Life, but it ran too slowly to be practical.
So I decided to parallelize it, but I failed: it now crashes with a segmentation fault, which is very likely due to a data race.
A brief outline of the code:

/* ... */
#include <list>
#include <mutex>
#include <future>
#include <iostream>
#include <forward_list>
int main() {
    /* ... */
    while (true) {
        /* ... */
        std::forward_list</*pattern type*/> results;
        std::list<std::future<void>> results_future;
        std::mutex results_mutex;
        for (/*All possible unique patterns in the given grid*/)
            results_future.push_back(std::async([&]{
                /*A very hard calculation*/
                if (/*The pattern is what I'm looking for*/) {
                    std::lock_guard<std::mutex> results_guard(results_mutex);
                    results.push_front(std::move(/*The pattern*/));
                }
            }));
        while (!results_future.empty()) {
            results_future.front().wait();
            results_future.pop_front();
        }
        if (!results.empty()) {
            for (auto &res : results)
                std::cout << "Pattern found:" << std::endl << res;
            return 0;
        }
    }
}

I'm pretty sure results is the only object declared outside the lambda's scope that is modified inside it, so I guarded it with a mutex.
But the data race is still there. So what is causing it?

Answer 1:

I found that the problem is related to the lambda capture:

for (/*All possible unique patterns in the given grid*/)
    results_future.push_back(std::async([&]{
        /*pattern type*/ patt_orig = search_grid;
        /* ... */
    }));

search_grid, as noted in SingerOfTheFall's comment above, is captured by reference and converted to the pattern type inside the lambda body. That conversion runs on the worker thread, so search_grid can be modified by the enclosing loop while a task is still reading it. Data race!

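The conversion must happen in the lambda capture instead (a C++14 init-capture), so that the copy is made on the calling thread when the lambda is created: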

for (/*All possible unique patterns in the given grid*/)
    results_future.push_back(std::async([&, patt_orig = (/*pattern type*/)search_grid]{
        /* ... */
    }));

And everything's okay now.
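
For anyone who wants to try the idea in isolation, here is a minimal, self-contained sketch of the same fix. It is not the original program: Grid, sum_snapshots, and the summing "calculation" are hypothetical stand-ins for the real pattern type and search. The only point it illustrates is that an init-capture copies search_grid on the calling thread, before the task starts, so the loop can keep mutating the original.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <mutex>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the real grid / pattern type.
using Grid = std::vector<int>;

// Launches one task per loop iteration. Each task works on its own snapshot
// of search_grid, so the loop may keep modifying the original safely.
std::vector<int> sum_snapshots(int iterations) {
    assert(iterations >= 0);

    Grid search_grid(4, 0);              // modified by the loop below
    std::vector<int> results;            // shared between tasks
    std::mutex results_mutex;            // guards results
    std::vector<std::future<void>> tasks;

    for (int i = 0; i < iterations; ++i) {
        // Caller-side mutation, analogous to advancing to the next pattern.
        search_grid[static_cast<std::size_t>(i) % search_grid.size()] += 1;

        // `snapshot = search_grid` is evaluated here, on the calling thread,
        // when the lambda object is created. The task never touches
        // search_grid itself, only its private copy.
        tasks.push_back(std::async(std::launch::async,
            [&results, &results_mutex, snapshot = search_grid] {
                // Placeholder for the expensive calculation.
                int sum = std::accumulate(snapshot.begin(), snapshot.end(), 0);
                std::lock_guard<std::mutex> guard(results_mutex);
                results.push_back(sum);
            }));
    }

    // Wait for every task before results and results_mutex go out of scope.
    for (auto &task : tasks)
        task.wait();
    return results;
}
```

Calling sum_snapshots(8) should give back eight sums, one per snapshot, in whatever order the tasks happened to finish. Listing the reference captures explicitly instead of using [&] is a design choice in this sketch: it makes it harder to capture something like search_grid by reference by accident.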