First, I am aware that mutexes are not normally considered async-signal-safe. This question concerns the use of sigprocmask
to make mutexes safe in a multithreaded program with asynchronous signals and signal handlers.
I have some code conceptually like the following:
struct { int a, b; } gvars;

void sigfoo_handler(int signo, siginfo_t *info, void *context) {
    if (gvars.a == 42 || gvars.b == 13) {
        /* run a chained signal handler */
    }
}

/* called from normal code */
void update_gvars(int a, int b) {
    gvars.a = a;
    gvars.b = b;
}
gvars is a global variable which is too large to fit in a single sig_atomic_t. It is updated by normal code and read from the signal handler. The controlled code is a chained signal handler, and so it must run in signal handler context (it may use info or context). Consequently, all accesses to gvars have to be controlled via some sort of synchronization mechanism. Complicating matters, the program is multithreaded, and any thread may receive a SIGFOO.
Question: By combining sigprocmask (or pthread_sigmask) and pthread_mutex_t, is it possible to guarantee synchronization, using code like the following?
#include <pthread.h>
#include <signal.h>

struct { int a, b; } gvars;
pthread_mutex_t gvars_mutex = PTHREAD_MUTEX_INITIALIZER;

void sigfoo_handler(int signo, siginfo_t *info, void *context) {
    /* Assume SIGFOO's handler does not have SA_NODEFER set, i.e. SIGFOO is automatically blocked upon entry */
    pthread_mutex_lock(&gvars_mutex);
    int cond = gvars.a == 42 || gvars.b == 13;
    pthread_mutex_unlock(&gvars_mutex);
    if (cond) {
        /* run a chained signal handler */
    }
}

/* called from normal code */
void update_gvars(int a, int b) {
    sigset_t set, oset;
    sigemptyset(&set);
    sigaddset(&set, SIGFOO);
    /* Block SIGFOO in this thread so the handler cannot run while we hold the mutex */
    pthread_sigmask(SIG_BLOCK, &set, &oset);
    pthread_mutex_lock(&gvars_mutex);
    gvars.a = a;
    gvars.b = b;
    pthread_mutex_unlock(&gvars_mutex);
    /* Restore the previous mask; a pending SIGFOO may be delivered here */
    pthread_sigmask(SIG_SETMASK, &oset, NULL);
}
The logic is as follows: within sigfoo_handler, SIGFOO is blocked, so it cannot interrupt the pthread_mutex_lock. Within update_gvars, SIGFOO cannot be delivered to the current thread during the pthread_sigmask-protected critical region, and so it can't interrupt the pthread_mutex_lock either. Assuming no other signals (and we can always block any other signals that could be problematic), the lock/unlock calls should always proceed in a normal, uninterruptible fashion on the current thread, and the use of lock/unlock should ensure that other threads don't interfere. Am I right, or should I avoid this approach?
You obviously know you're in undefined behavior territory, given your mention of sig_atomic_t. That being said, the only way I can see this exact example not working on modern Unix-like systems is if the signal handler was installed with SA_NODEFER.
The mutex is enough to ensure proper synchronization between different threads (including the signal handler being run in another thread), and the signal mask prevents the signal handler from running in this thread while it already holds the mutex, which would make the handler try to lock it a second time.
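To make the two conditions above concrete, here is a minimal sketch of the whole pattern, including the handler installation that the question leaves out. SIGUSR1 stands in for SIGFOO, and chained_calls is a hypothetical counter that replaces the chained handler; neither comes from the question. It illustrates the argument rather than proving portability, since POSIX still does not list pthread_mutex_lock as async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/* Assumptions: SIGUSR1 plays the role of SIGFOO, and chained_calls is a
 * hypothetical stand-in for "run a chained signal handler". */
static struct { int a, b; } gvars;
static pthread_mutex_t gvars_mutex = PTHREAD_MUTEX_INITIALIZER;
static volatile sig_atomic_t chained_calls;

static void sigfoo_handler(int signo, siginfo_t *info, void *context) {
    (void)signo; (void)info; (void)context;
    /* SIGUSR1 is blocked here because SA_NODEFER is not set below. */
    pthread_mutex_lock(&gvars_mutex);
    int cond = gvars.a == 42 || gvars.b == 13;
    pthread_mutex_unlock(&gvars_mutex);
    if (cond)
        chained_calls++;
}

static void install_sigfoo_handler(void) {
    struct sigaction sa;
    sa.sa_sigaction = sigfoo_handler;
    sa.sa_flags = SA_SIGINFO;      /* deliberately no SA_NODEFER */
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Called from normal code: block the signal first, then take the mutex. */
static void update_gvars(int a, int b) {
    sigset_t set, oset;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &set, &oset);
    pthread_mutex_lock(&gvars_mutex);
    gvars.a = a;
    gvars.b = b;
    pthread_mutex_unlock(&gvars_mutex);
    pthread_sigmask(SIG_SETMASK, &oset, NULL);
}
```

Calling install_sigfoo_handler() once at startup, then update_gvars(42, 0) followed by raise(SIGUSR1), should run the handler's "chained" branch exactly once. The order in update_gvars matters: the mask has to be changed before the lock is taken and restored only after it is released.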
That being said, you're in deep water with locks inside signal handlers. One signal handler might be safe enough, but if you had two signal handlers doing the same trick with different locks, you could end up with lock-ordering deadlocks. This can be somewhat mitigated by applying process sigmasks instead of thread sigmasks. Even a simple debugging fprintf in the signal handler will violate lock ordering, for example, because stdio streams take their own internal lock.
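One way to reduce the lock-ordering risk, in line with the question's remark that other problematic signals can always be blocked, is to block every signal whose handler takes a lock before taking any of those locks in normal code. The sketch below is one reading of that mitigation, not the answer's exact proposal. SIGUSR1 and SIGUSR2 are placeholder signals, and lock_a, lock_shared, unlock_shared and locking_signals_blocked are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/* Hypothetical lock that is also taken by one of the signal handlers. */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;

/* Build the set of all signals whose handlers take a lock
 * (assumed here to be SIGUSR1 and SIGUSR2). */
static void locking_signal_set(sigset_t *set) {
    sigemptyset(set);
    sigaddset(set, SIGUSR1);
    sigaddset(set, SIGUSR2);
}

/* Block all lock-taking signals in this thread, then take the mutex. */
static void lock_shared(pthread_mutex_t *m, sigset_t *old) {
    sigset_t set;
    locking_signal_set(&set);
    int rc = pthread_sigmask(SIG_BLOCK, &set, old);
    assert(rc == 0);
    (void)rc;
    pthread_mutex_lock(m);
}

/* Release the mutex first, then restore the caller's signal mask. */
static void unlock_shared(pthread_mutex_t *m, const sigset_t *old) {
    pthread_mutex_unlock(m);
    pthread_sigmask(SIG_SETMASK, old, NULL);
}

/* Returns 1 if both signals are currently blocked in the calling thread. */
static int locking_signals_blocked(void) {
    sigset_t cur;
    pthread_sigmask(SIG_BLOCK, NULL, &cur);  /* NULL set: query only */
    return sigismember(&cur, SIGUSR1) == 1 && sigismember(&cur, SIGUSR2) == 1;
}
```

For the same reason, each handler's sa_mask would need to include the other lock-taking signals, so that one handler cannot interrupt another while it holds its lock.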
I would back away and redesign my application, because stuff like this in a signal handler is a sign that it's getting too complex and too easy to break. A signal handler touching a single volatile sig_atomic_t is about the only thing the C standard defines, precisely because of the exploding complexity of getting anything else right.
I found this paper https://www.cs.purdue.edu/homes/rego/cs543/threads/signals.pdf that discusses running AS-unsafe code in signal handlers safely by either
- masking out signals in AS-unsafe blocks of normal-context code (explored as less efficient) OR
- protecting AS-unsafe blocks of normal-context code with a global volatile sig_atomic_t flag that prevents the AS-unsafe code in the handler from being entered if set (explored as efficient)
This approach satisfies the part of the POSIX standard that says that calling AS-unsafe functions in signal handlers is only deemed unsafe if the handler interrupts an AS-unsafe function (http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03 : scroll down to the first paragraph after the function list).
I think what you're toying with here is essentially a more fine-grained version of this idea, since you're not trying to prevent
    pthread_mutex_lock(&gvars_mutex);
    int cond = gvars.a == 42 || gvars.b == 13;
    pthread_mutex_unlock(&gvars_mutex);
when run from a signal handler, from clashing with any AS-unsafe code at all, but only with the same or similar AS-unsafe code that deals with this mutex and these variables.
Unfortunately, POSIX only seems to have a code-only concept of signal-safety: a function is either safe or unsafe, regardless of its arguments.
However, IMO, a semaphore or mutex function has no good reason to operate on any data or OS handles other than those contained in the semaphore or mutex it is passed, so I think calling sem_wait(&sem) or pthread_mutex_lock(&mx) from a signal handler ought to be safe if it's guaranteed never to clash with a sem_wait or pthread_mutex_lock on the same semaphore or mutex, even though the POSIX standard technically says it shouldn't be safe (counter-arguments more than welcome).