可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I would like to know exactly how the execution of asynchronous signal handlers works on Linux. First, I am unclear as to which thread executes the signal handler. Second, I would like to know the steps that are followed to make the thread execute the signal handler.

On the first matter, I have read two different, seemingly conflicting, explanations:

The Linux Kernel, by Andries Brouwer, §5.2 "Receiving signals" states:

When a signal arrives, the process is interrupted, the current registers are saved, and the signal handler is invoked. When the signal handler returns, the interrupted activity is continued.
The StackOverflow question "Dealing With Asynchronous Signals In Multi Threaded Program" leads me to think that Linux's behavior is like SCO Unix's:
When a signal is delivered to a process, if it is being caught, it will be handled by one, and only one, of the threads meeting either of the following conditions:
1. A thread blocked in a sigwait(2) system call whose argument does include the type of the caught signal.
2. A thread whose signal mask does not include the type of the caught signal.
Additional considerations:
- A thread blocked in sigwait(2) is given preference over a thread not blocking the signal type.
- If more than one thread meets these requirements (perhaps two threads are calling sigwait(2)), then one of them will be chosen. This choice is not predictable by application programs.
- If no thread is eligible, the signal will remain ``pending'' at the process level until some thread becomes eligible.
Also, "The Linux Signals Handling Model" by Moshe Bar states "Asynchronous signals are delivered to the first thread found not blocking the signal.", which I interpret to mean that the signal is delivered to some thread having its sigmask not including the signal.

Which one is correct?

On the second matter, what happens to the stack and register contents for the selected thread? Suppose the thread-to-run-the-signal-handler T is in the middle of executing a do_stuff() function. Is thread T's stack used directly to execute the signal handler (i.e. the address of the signal trampoline is pushed onto T's stack and control flow goes to the signal handler)? Alternatively, is a separate stack used? How does it work?

回答1:

Source #1 (Andries Brouwer) is correct for a single-threaded process. Source #2 (SCO Unix) is wrong for Linux, because Linux does not prefer threads in sigwait(2). Moshe Bar is correct about the first available thread.

Which thread gets the signal? Linux's manual pages are a good reference. A process uses clone(2) with CLONE_THREAD to create multiple threads. These threads belong to a "thread group" and share a single process ID. The manual for clone(2) says,

Signals may be sent to a thread group as a whole (i.e., a TGID) using kill(2), or to a specific thread (i.e., TID) using tgkill(2).

Signal dispositions and actions are process-wide: if an unhandled signal is delivered to a thread, then it will affect (terminate, stop, continue, be ignored in) all members of the thread group.

Each thread has its own signal mask, as set by sigprocmask(2), but signals can be pending either: for the whole process (i.e., deliverable to any member of the thread group), when sent with kill(2); or for an individual thread, when sent with tgkill(2). A call to sigpending(2) returns a signal set that is the union of the signals pending for the whole process and the signals that are pending for the calling thread.

If kill(2) is used to send a signal to a thread group, and the thread group has installed a handler for the signal, then the handler will be invoked in exactly one, arbitrarily selected member of the thread group that has not blocked the signal. If multiple threads in a group are waiting to accept the same signal using sigwaitinfo(2), the kernel will arbitrarily select one of these threads to receive a signal sent using kill(2).

Linux is not SCO Unix, because Linux might give the signal to any thread, even if some threads are waiting for a signal (with sigwaitinfo, sigtimedwait, or sigwait) and some threads are not. The manual for sigwaitinfo(2) warns,

In normal usage, the calling program blocks the signals in set via a prior call to sigprocmask(2) (so that the default disposition for these signals does not occur if they become pending between successive calls to sigwaitinfo() or sigtimedwait()) and does not establish handlers for these signals. In a multithreaded program, the signal should be blocked in all threads, in order to prevent the signal being treated according to its default disposition in a thread other than the one calling sigwaitinfo() or sigtimedwait()).

The code to pick a thread for the signal lives in linux/kernel/signal.c (the link points to GitHub's mirror). See the functions wants_signal() and completes_signal(). The code picks the first available thread for the signal. An available thread is one that doesn't block the signal and has no other signals in its queue. The code happens to check the main thread first, then it checks the other threads in some order unknown to me. If no thread is available, then the signal is stuck until some thread unblocks the signal or empties its queue.

What happens when a thread gets the signal? If there is a signal handler, then the kernel causes the thread to call the handler. Most handlers run on the thread's stack. A handler can run on an alternate stack if the process uses sigaltstack(2) to provide the stack, and sigaction(2) with SA_ONSTACK to set the handler. The kernel pushes some things onto the chosen stack, and sets some of the thread's registers.

To run the handler, the thread must be running in userspace. If the thread is running in the kernel (perhaps for a system call or a page fault), then it does not run the handler until it goes to userspace. The kernel can interrupt some system calls, so the thread runs the handler now, without waiting for the system call to finish.

The signal handler is a C function, so the kernel obeys the architecture's convention for calling C functions. Each architecture, like arm, i386, powerpc, or sparc, has its own convention. For powerpc, to call handler(signum), the kernel sets the register r3 to signum. The kernel also sets the handler's return address to the signal trampoline. The return address goes on the stack or in a register by convention.

The kernel puts one signal trampoline in each process. This trampoline calls sigreturn(2) to restore the thread. In the kernel, sigreturn(2) reads some information (like saved registers) from the stack. The kernel had pushed this information on the stack before calling the handler. If there was an interrupted system call, the kernel might restart the call (only if the handler used SA_RESTART), or fail the call with EINTR, or return a short read or write.

回答2:

These two explanations really aren't contradictory if you take into account the fact that Linux hackers tend to be confused about the difference between a thread and a process, mainly due to the historical mistake of trying to pretend threads could be implemented as processes that share memory. :-)

With that said, explanation #2 is much more detailed, complete, and correct.

As for the stack and register contents, each thread can register its own alternate signal-handling stack, and the process can choose on a per-signal basis which signals will be delivered on alternate signal-handling stacks. The interrupted context (registers, signal mask, etc.) will be saved in a ucontext_t structure on the (possibly alternate) stack for the thread, along with the trampoline return address. Signal handlers installed with the SA_SIGINFO flag are able to examine this ucontext_t structure if they like, but the only portable thing they can do with it is examine (and possibly modify) the saved signal mask. (I'm not sure if modifying it is sanctioned by the standard, but it's very useful because it allows the signal handler to atomically replace the interrupted code's signal mask upon return, for instance to leave the signal blocked so it can't happen again.)