Ways to Find a Race Condition

Posted 2020-06-01 01:08

I have a piece of code with a race condition in it. I believe it is a race condition because the failure does not happen consistently, and it seems to happen more often on dual-core machines.

It never happens when I'm tracing, although it is possible that it is a deadlock instead. By comparing how far the logs get in runs where the failure does and does not occur, I've been able to narrow the bug down to a single function. However, I do not know where inside that function it happens. It's not at the top level.

If it is a race condition, adding log statements or breakpoints will change the timing and may keep the failure from occurring.

Is there any technique that I can use aside from getting a race condition analyzer that will allow me to pinpoint where this is happening?

This is in Visual Studio 9 (Visual Studio 2008), with unmanaged C++.

8 answers
Rolldiameter
#2 · 2020-06-01 01:38

There is a tool called ThreadSanitizer that is included in Clang and in GCC 4.8+.

You compile your code with the -fsanitize=thread flag.

Example:

$ cat simple_race.cc
#include <pthread.h>
#include <stdio.h>

int Global;

void *Thread1(void *x) {
  Global++;
  return NULL;
}

void *Thread2(void *x) {
  Global--;
  return NULL;
}

int main() {
  pthread_t t[2];
  pthread_create(&t[0], NULL, Thread1, NULL);
  pthread_create(&t[1], NULL, Thread2, NULL);
  pthread_join(t[0], NULL);
  pthread_join(t[1], NULL);
}

Compiling and running it gives this output:

$ clang++ simple_race.cc -fsanitize=thread -fPIE -pie -g
$ ./a.out 
==================
WARNING: ThreadSanitizer: data race (pid=26327)
  Write of size 4 at 0x7f89554701d0 by thread T1:
    #0 Thread1(void*) simple_race.cc:8 (exe+0x000000006e66)

  Previous write of size 4 at 0x7f89554701d0 by thread T2:
    #0 Thread2(void*) simple_race.cc:13 (exe+0x000000006ed6)

  Thread T1 (tid=26328, running) created at:
    #0 pthread_create tsan_interceptors.cc:683 (exe+0x00000001108b)
    #1 main simple_race.cc:19 (exe+0x000000006f39)

  Thread T2 (tid=26329, running) created at:
    #0 pthread_create tsan_interceptors.cc:683 (exe+0x00000001108b)
    #1 main simple_race.cc:20 (exe+0x000000006f63)
==================
ThreadSanitizer: reported 1 warnings
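
For the example above, one possible fix is to make the shared counter atomic. This is only a sketch and not part of the original answer: it assumes a C++11 compiler (which Visual Studio 9 is not), and the names Increment, Decrement, and RunBothThreads are made up for illustration.

```cpp
#include <atomic>
#include <thread>

// Hypothetical fix for simple_race.cc: the shared counter becomes atomic,
// so concurrent updates are no longer a data race.
std::atomic<int> Global{0};

void Increment() { Global.fetch_add(1, std::memory_order_relaxed); }
void Decrement() { Global.fetch_sub(1, std::memory_order_relaxed); }

// Runs both threads to completion and returns the final counter value.
int RunBothThreads() {
  Global.store(0);
  std::thread t1(Increment);
  std::thread t2(Decrement);
  t1.join();
  t2.join();
  return Global.load();
}
```

Because each update is a single atomic read-modify-write, the increment and the decrement cannot overwrite each other, and rebuilding with -fsanitize=thread should no longer report this particular race. A mutex around the two updates would work as well.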
爷、活的狠高调
#3 · 2020-06-01 01:38

The best way I know of to track these down is to use CHESS in Visual Studio. It is not a simple tool to use, and you will probably have to test your app one subsection at a time. Good luck.

We Are One
#4 · 2020-06-01 01:42

You can use a tool such as Intel Inspector, which can check for certain types of race conditions.

Lonely孤独者°
#5 · 2020-06-01 01:44

Put sleeps in various parts of your code. Code that is thread-safe stays thread-safe even if it (or other code running asynchronously) sleeps for several seconds, so a sleep that makes the failure show up reliably points at the unsafe region.
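
A minimal sketch of the sleep idea, assuming a C++11 compiler (Visual Studio 9 would need the Win32 Sleep call and its own threading API instead). All names here (RaceDelay, UnsafeIncrement, SafeIncrement, RunTwoThreads) are hypothetical. The unsafe version splits a read and a write into two steps, and the injected sleep sits in the gap between them.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical helper that widens a suspected race window.
inline void RaceDelay(std::chrono::milliseconds d) {
  std::this_thread::sleep_for(d);
}

std::atomic<int> counter{0};
std::mutex counter_mutex;

// Unsafe: the load and the store are separate steps, so another thread
// can update the counter in between and have its update overwritten.
void UnsafeIncrement(std::chrono::milliseconds delay) {
  int seen = counter.load();
  RaceDelay(delay);  // injected sleep inside the suspected window
  counter.store(seen + 1);
}

// Safe: the whole read-modify-write runs under a lock, so the same
// sleep cannot change the result.
void SafeIncrement(std::chrono::milliseconds delay) {
  std::lock_guard<std::mutex> lock(counter_mutex);
  int seen = counter.load();
  RaceDelay(delay);
  counter.store(seen + 1);
}

// Runs two threads through fn and returns the final counter value.
int RunTwoThreads(void (*fn)(std::chrono::milliseconds),
                  std::chrono::milliseconds delay) {
  counter.store(0);
  std::thread a(fn, delay);
  std::thread b(fn, delay);
  a.join();
  b.join();
  return counter.load();
}
```

With SafeIncrement the result is 2 no matter how long the sleep is. With UnsafeIncrement and a delay of a few tens of milliseconds, both threads will usually read 0 before either writes, so the result is typically 1. Moving the sleep from place to place inside the suspect function and watching where the failure rate jumps is one way to narrow down the window.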

戒情不戒烟
#6 · 2020-06-01 01:45

I've had some luck using Visual Studio's tracepoints to find race conditions. Of course they still affect the timing, but in the cases where I used them, at least, the effect wasn't enough to completely prevent the race conditions from occurring. They seemed less disruptive than dedicated logging.

Other than that, try posting the code so that others can look it over. Simply studying the code in detail isn't a bad way to find race conditions.
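
Tracepoints are set up in the IDE rather than in code, so the following is a different but related technique: recording markers into a preallocated in-memory buffer, which tends to disturb timing less than writing log lines to a file or console. This is a sketch that assumes a C++11 compiler, and every name in it (TraceEvent, Trace, CollectTrace, kTraceCapacity) is made up for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical in-memory trace record.
struct TraceEvent {
  int point_id;               // which marker in the code was reached
  std::thread::id thread_id;  // which thread reached it
};

const std::size_t kTraceCapacity = 1024;
TraceEvent g_trace[kTraceCapacity];
std::atomic<std::size_t> g_trace_next{0};

// Records a marker with one atomic increment and no I/O.
// Events past the capacity are dropped rather than wrapped around.
inline void Trace(int point_id) {
  std::size_t slot = g_trace_next.fetch_add(1, std::memory_order_relaxed);
  if (slot < kTraceCapacity) {
    g_trace[slot].point_id = point_id;
    g_trace[slot].thread_id = std::this_thread::get_id();
  }
}

// Call only after all traced threads have been joined.
std::vector<TraceEvent> CollectTrace() {
  std::size_t n = g_trace_next.load();
  if (n > kTraceCapacity) n = kTraceCapacity;
  return std::vector<TraceEvent>(g_trace, g_trace + n);
}
```

Calls such as Trace(1), Trace(2) would be sprinkled through the suspect function, and the buffer dumped once the threads have finished. The slot order shows the order in which the threads claimed slots, which can reveal an unexpected interleaving. It still changes timing a little, so it may hide some races.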

Emotional °昔
#7 · 2020-06-01 01:45

It could also be a resource that is not protected, which would explain the inconsistent behaviour (especially if it works fine on a single core but not on a dual core). In any case, code review (for both race conditions and non-thread-safe code) can be the shortest path to a solution.
