Helgrind (Valgrind) and OpenMP (C): avoiding false

Posted 2019-01-23 22:49

Question:

The documentation for Helgrind, the Valgrind thread error detection tool, warns that if you compile OpenMP code with GCC, GCC's OpenMP runtime library (libgomp.so) will cause a flood of false positive data race reports, because it uses atomic machine instructions and Linux futex system calls instead of POSIX pthreads primitives. The documentation says you can solve this problem by rebuilding GCC with the --disable-linux-futex configuration option.

So I tried this. I compiled and installed to a local directory (~/GCC_Valgrind/gcc_install) a new GCC version 4.7.0 (the latest release as of this writing) with the --disable-linux-futex configuration option. I then created a small OpenMP test program (test1.c) that has no visible data races:

/* test1.c */

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 2

int a[NUM_THREADS];

int main(void) {
        int i;
#pragma omp parallel num_threads(NUM_THREADS)
        {
                int tid = omp_get_thread_num();
                a[tid] = tid + 1;
        }
        for (i = 0; i < NUM_THREADS; i++)
                printf("%d ", a[i]);
        printf("\n");
        return EXIT_SUCCESS;
}

I compiled this program as follows:

$HOME/GCC_Valgrind/gcc_install/bin/gcc -Wall -fopenmp -static -L$HOME/GCC_Valgrind/gcc_install/lib64 -L$HOME/GCC_Valgrind/gcc_install/lib -o test1 test1.c

However, when I ran Helgrind on the resulting binary, I got 30 false positive data race reports, all occurring in libgomp code. I then compiled test1.c without the -static flag and ran Helgrind on it again. This time I got only 9 false positive data race reports, but that is still too many, and without the -static flag I cannot trace the supposed races in the libgomp code.

Has anybody found a way to reduce, if not eliminate, the number of false positive data race reports from Helgrind applied to an OpenMP program compiled with GCC? Thanks!

Answer 1:

Sorry to put this in as an answer since it's more of a comment, but it's too long to fit in as a comment, so here goes:

From the site you referenced:

Runtime support library for GNU OpenMP (part of GCC), at least for GCC versions 4.2 and 4.3. The GNU OpenMP runtime library (libgomp.so) constructs its own synchronisation primitives using combinations of atomic memory instructions and the futex syscall, which causes total chaos in Helgrind since it cannot "see" those.

Fortunately, this can be solved using a configuration-time option (for GCC). Rebuild GCC from source, and configure using --disable-linux-futex. This makes libgomp.so use the standard POSIX threading primitives instead. Note that this was tested using GCC 4.2.3 and has not been re-tested using more recent GCC versions. We would appreciate hearing about any successes or failures with more recent versions.

As you mentioned in your post, this has to do with libgomp.so. But that is a shared object, so I don't see how you can pass the -static flag and still use that library. Am I just misinformed?



Answer 2:

The following steps make it work:

  1. Recompile GCC (including libgomp) using --disable-linux-futex.
  2. Make sure you use the futex-free GCC when compiling your program.
  3. Make sure the system loads the futex-free libgomp when executing your program (the library is usually in GCC-OBJ-DIR/PLATFORM/libgomp/.libs), for example by setting the LD_LIBRARY_PATH environment variable:

export LD_LIBRARY_PATH=~/gcc-4.8.1-nofutex/x86_64-unknown-linux-gnu/libgomp/.libs: