Is it safe to fork from within a thread?

2019-01-06 11:09发布

问题:

Let me explain: I have already been developing an application on Linux which forks and execs an external binary and waits for it to finish. Results are communicated by shm files that are unique to the fork + process. The entire code is encapsulated within a class.

Now I am considering threading the process in order to speed things up. Having many different instances of class functions fork and execute the binary concurrently (with different parameters) and communicate results with their own unique shm files.

Is this thread safe? If I fork within a thread, apart from being safe, is there something I have to watch for? Any advice or help is much appreciated!

回答1:

forking, even with threads, is safe. Once you fork, the threads are independent per process. (That is, threading is orthogonal to forking). However, if the threads in different processes use the same shared memory to comunicate, you have to devise a synchronization mechanism.



回答2:

The problem is that fork() only copies the calling thread, and any mutexes held in child threads will be forever locked in the forked child. The pthread solution was the pthread_atfork() handlers. The idea was you can register 3 handlers: one prefork, one parent handler, and one child handler. When fork() happens prefork is called prior to fork and is expected to obtain all application mutexes. Both parent and child must release all mutexes in parent and child processes respectively.

This isn't the end of the story though! Libraries call pthread_atfork to register handlers for library specific mutexes, for example Libc does this. This is a good thing: the application can't possibly know about the mutexes held by 3rd party libraries, so each library must call pthread_atfork to ensure it's own mutexes are cleaned up in the event of a fork().

The problem is that the order that pthread_atfork handlers are called for unrelated libraries is undefined (it depends on the order that the libraries are loaded by the program). So this means that technically a deadlock can happen inside of a prefork handler because of a race condition.

For example, consider this sequence:

  1. Thread T1 calls fork()
  2. prefork handlers for libc obtained in T1
  3. Next, in Thread T2, a 3rd party library A acquires its own mutex AM, and then makes a libc call which requires a mutex. This blocks, because libc mutexes are held by T1.
  4. Thread T1 runs prefork handler for library A, which blocks waiting to obtain AM, which is held by T2.

There's your deadlock and its unrelated to your own mutexes or code.

This actually happened on a project I once worked on. The advice I had found at that time was to choose fork or threads but not both. But for some applications that's probably not practical.



回答3:

It's safe to fork in a multithreaded program as long as you are very careful about the code between fork and exec. You can make only re-enterant (aka asynchronous-safe) system calls in that span. In theory, you are not allowed to malloc or free there, although in practice the default Linux allocator is safe, and Linux libraries came to rely on it End result is that you must use the default allocator.



回答4:

While you can use Linux's NPTL pthreads(7) support for your program, threads are an awkward fit on Unix systems, as you've discovered with your fork(2) question.

Since fork(2) is a very cheap operation on modern systems, you might do better to just fork(2) your process when you have more handling to perform. It depends upon how much data you intend to move back and forth, the share-nothing philosophy of forked processes is good for reducing shared-data bugs but does mean you either need to create pipes to move data between processes or use shared memory (shmget(2) or shm_open(3)).

But if you choose to use threading, you can fork(2) a new process, with the following hints from the fork(2) manpage:

   *  The child process is created with a single thread — the
      one that called fork().  The entire virtual address space
      of the parent is replicated in the child, including the
      states of mutexes, condition variables, and other pthreads
      objects; the use of pthread_atfork(3) may be helpful for
      dealing with problems that this can cause.


回答5:

Back at the Dawn of Time, we called threads "lightweight processes" because while they act a lot like processes, they're not identical. The biggest distinction is that threads by definition live in the same address space of one process. This has advantages: switching from thread to thread is fast, they inherently share memory so inter-thread communications are fast, and creating and disposing of threads is fast.

The distinction here is with "heavyweight processes", which are complete address spaces. A new heavyweight process is created by fork(2). As virtual memory came into the UNIX world, that was augmented with vfork(2) and some others.

A fork(2) copies the entire address space of the process, including all the registers, and puts that process under the control of the operating system scheduler; the next time the scheduler comes around, the instruction counter picks up at the next instruction -- the forked child process is a clone of the parent. (If you want to run another program, say because you're writing a shell, you follow the fork with an exec(2) call, which loads that new address space with a new program, replacing the one that was cloned.)

Basically, your answer is buried in that explanation: when you have a process with many LWPs threads and you fork the process, you will have two independent processes with many threads, running concurrently.

This trick is even useful: in many programs, you have a parent process that may have many threads, some of which fork new child processes. (For example, an HTTP server might do that: each connection to port 80 is handled by a thread, and then a child process for something like a CGI program could be forked; exec(2) would then be called to run the CGI program in place of the parent process close.)



回答6:

Provided you quickly either call exec or _exit in the forked child process, you're ok in practice.

You might want to use posix_spawn() instead which will probably do the Right Thing.



回答7:

If you are using the unix 'fork()' system call, then you are not technically using threads- you are using processes- they will have their own memory space, and therefore cannot interfere with eachother.

As long as each process uses different files, there should not be any issue.