This question is similar to "How to call a function on a thread's creation and exit?" but more specific. In another multi-process shared memory project I used a combination of an __attribute__((constructor)) labeled library init routine, lazy initialisation for each thread, and robust futexes to make sure resources weren't leaked in the shared memory even if a sysadmin chose to SIGKILL one of the processes using it. However, futexes within the APIs are way too heavyweight for my current project, and even the few instructions to deke around some lazy initialisation are something I'd rather avoid. The library APIs will literally be called several trillion times over a few hundred threads across several processes (each API is only a couple hundred instructions).
I am guessing the answer is no, but since I spent a couple of hours looking for and not finding a definitive answer, I thought I'd ask it here so that the next person looking for a simple answer will be able to find it more quickly.
My goal is pretty simple: perform some per-thread initialisation as threads are created in multiple processes asynchronously, and robustly perform some cleanup at some point when threads are destroyed asynchronously. Doesn't have to be immediately, it just has to happen eventually.
Some hypothetical ideas to engage critical thinking: a hypothetical pthread_atclone() called from an __attribute__((constructor)) labeled library init func would satisfy the first condition. The second could be satisfied by an extension to futex() that adds a semop-like operation with a per-thread futex_adj value: if that value is non-zero in do_exit(), FUTEX_OWNER_DIED is set for the futex "semaphore", allowing cleanup the next time the futex is touched.
Well, first, you should document that library users should not asynchronously terminate threads in a way that doesn't explicitly release the resources belonging to your library (closing a handle, whatever). TBH, just terminating threads at all before process termination is a bad idea.
It's more difficult to detect whether a whole process has been SIGKILLed while it's using your lib. My current best guess is that all processes wishing to use your library have to log in first so that their pid can be added to a container. Using a thread started at your lib initialization, poll for pids that have disappeared with kill(pid, 0) and take any appropriate cleanup action. It's not very satisfactory (I hate polling), but I don't see any alternatives that are not grossly messy :(
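To make the polling idea concrete, here is a minimal sketch of the liveness check and one sweep over a registry of pids. The names `pid_is_alive` and `reap_dead_pids`, and the flat array used as the "container", are assumptions for illustration only; how pids get registered and how often the sweep runs are left open.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns true if a process with this pid currently exists.
 * kill(pid, 0) delivers no signal; it only runs the existence and
 * permission checks. EPERM means the process exists but we may not
 * signal it, so it still counts as alive. */
static bool pid_is_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return true;
    return errno == EPERM;
}

/* Hypothetical registry sweep: calls cleanup(pid) for every registered
 * pid that has disappeared, compacts the array in place, and returns
 * the number of pids that are still alive. */
static size_t reap_dead_pids(pid_t *pids, size_t n, void (*cleanup)(pid_t))
{
    size_t live = 0;
    for (size_t i = 0; i < n; i++) {
        if (pid_is_alive(pids[i]))
            pids[live++] = pids[i];
        else
            cleanup(pids[i]);
    }
    return live;
}
```

One caveat with this approach: pids can be recycled, so a pid that "exists" is not guaranteed to be the process that originally logged in. Storing something like the process start time alongside the pid would be one way to narrow that window.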
After research and experimentation I've come up with what seems to be current "best practice" as far as I can tell. If anyone knows any better, please comment!
For the first part, per-thread initialisation, I was not able to come up with any alternative to straightforward lazy initialisation. However, I did decide that it's slightly more efficient to move the branch to the caller, so that pipelining in the new stack frame isn't immediately confronted with an effectively unnecessary branch. So instead of this:
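A minimal sketch of the callee-side check, with hypothetical names (`tls_ready`, `thread_init`, `lib_api`) and trivial stand-ins for the real per-thread setup and the real work:

```c
#include <stdbool.h>

/* Hypothetical per-thread state; names are illustrative only. */
static _Thread_local bool tls_ready = false;
static _Thread_local int tls_calls = 0;

static void thread_init(void)
{
    tls_calls = 0;      /* stand-in for real per-thread setup */
    tls_ready = true;
}

/* Callee-side check: every call enters the function first and only
 * then tests whether this thread has been initialised. */
int lib_api(int x)
{
    if (!tls_ready)
        thread_init();
    tls_calls++;
    return x + 1;       /* stand-in for the real work */
}
```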
This:
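A minimal sketch of the caller-side check, again with hypothetical names (`lib_tls_ready`, `lib_thread_init`, `lib_fast_api_impl`, `lib_fast_api`). It assumes the flag and the inline wrapper would live in the library's public header, and it uses `__builtin_expect`, which is a GCC/Clang extension:

```c
#include <stdbool.h>

/* Would be declared in the public header so callers can see the flag. */
_Thread_local bool lib_tls_ready = false;

void lib_thread_init(void)
{
    lib_tls_ready = true;   /* stand-in for real per-thread setup */
}

/* The real entry point assumes the thread is already initialised. */
int lib_fast_api_impl(int x)
{
    return x + 1;           /* stand-in for the real work */
}

/* Inlined into the caller: the rarely-taken branch is tested before
 * the call, so the callee's frame never starts with it. */
static inline int lib_fast_api(int x)
{
    if (__builtin_expect(!lib_tls_ready, 0))
        lib_thread_init();
    return lib_fast_api_impl(x);
}
```

Whether this is measurably faster than the callee-side version depends on the compiler and CPU, so it is worth benchmarking rather than assuming.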
Comments on the (admittedly slight) usefulness of this welcome!
For the second part, robustly performing some cleanup when threads die no matter how asynchronously, I was not able to find any solution better than to have a reaping process epoll_wait() on a file descriptor for the read end of an open pipe, passed to it via an SCM_RIGHTS control message in a sendmsg() call on an abstract UNIX domain socket address. It sounds complex, but it's not that bad. Here's the client side:
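A minimal sketch of what the client side could look like. The function name `reaper_register` and the choice of a SOCK_DGRAM socket are assumptions, and `name` is the abstract address without its leading NUL byte. The returned write end is meant to stay open for as long as the client is alive; when the last copy of it is closed, for example because the process was SIGKILLed, the reaper's read end reports a hangup.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/un.h>
#include <unistd.h>

/* Creates a pipe, hands its read end to the reaper listening on the
 * abstract address `name`, and returns the write end (or -1 on error). */
static int reaper_register(const char *name)
{
    struct sockaddr_un addr;
    size_t len = strlen(name);
    if (len > sizeof addr.sun_path - 1)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path + 1, name, len);   /* sun_path[0] stays '\0' */

    int pfd[2];
    if (pipe(pfd) < 0)
        return -1;

    char byte = 0;                          /* one byte of ordinary data */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {                                 /* correctly aligned cmsg buffer */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctl;
    memset(&ctl, 0, sizeof ctl);

    struct msghdr msg;
    memset(&msg, 0, sizeof msg);
    msg.msg_name = &addr;
    msg.msg_namelen =
        (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;             /* payload is a file descriptor */
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &pfd[0], sizeof(int));

    int sock = socket(AF_UNIX, SOCK_DGRAM, 0);
    ssize_t sent = sock >= 0 ? sendmsg(sock, &msg, 0) : -1;
    if (sock >= 0)
        close(sock);
    close(pfd[0]);                          /* the reaper holds its own copy */
    if (sent != 1) {
        close(pfd[1]);
        return -1;
    }
    return pfd[1];
}
```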
And the reaper's code:
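A minimal sketch of one way the reaper could be structured. The names `recv_fd` and `reaper_step` are hypothetical, and it assumes `sock` is a SOCK_DGRAM socket already bound to the abstract address and added to the epoll instance `ep` with EPOLLIN and `data.fd` set to `sock`. Client pipes are registered with an empty event mask, since epoll always reports EPOLLHUP and EPOLLERR.

```c
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Receives one descriptor sent with SCM_RIGHTS; returns it or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctl;
    struct msghdr msg;
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(sock, &msg, 0) < 0)
        return -1;
    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    if (!cm || cm->cmsg_level != SOL_SOCKET || cm->cmsg_type != SCM_RIGHTS)
        return -1;
    int fd;
    memcpy(&fd, CMSG_DATA(cm), sizeof fd);
    return fd;
}

/* One iteration of the reaper loop. Returns the read end of a pipe
 * whose writer has gone away (already removed from epoll; the caller
 * runs its cleanup and closes it), or -1 if nobody died in this step. */
static int reaper_step(int ep, int sock, int timeout_ms)
{
    struct epoll_event ev;
    if (epoll_wait(ep, &ev, 1, timeout_ms) != 1)
        return -1;

    if (ev.data.fd == sock) {               /* new client registering */
        int fd = recv_fd(sock);
        if (fd >= 0) {
            struct epoll_event add = { .events = 0, .data.fd = fd };
            if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &add) < 0)
                close(fd);
        }
        return -1;
    }
    if (ev.events & (EPOLLHUP | EPOLLERR)) { /* client's write end is gone */
        epoll_ctl(ep, EPOLL_CTL_DEL, ev.data.fd, NULL);
        return ev.data.fd;
    }
    return -1;
}
```

A real reaper would also need some way to map the returned descriptor back to the shared-memory resources that client owned, for instance by having the client send an identifier in the data portion of the message instead of a dummy byte.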
At first I tried using eventfd() rather than pipe() but eventfd file descriptors represent objects not connections, so closing the fd in the client code did not produce an EPOLLHUP in the reaper. If anyone knows of a better alternative to pipe() for this, let me know!
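The difference can be reproduced in a single process. In this sketch (all function names are hypothetical), `dup()` stands in for the copy of the descriptor that the reaper would receive via SCM_RIGHTS, and closing the original stands in for the client dying:

```c
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Returns 1 if epoll reports EPOLLHUP on `fd` within 100 ms, else 0. */
static int sees_hangup(int fd)
{
    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = 0, .data.fd = fd };
    int hup = 0;
    if (ep >= 0 && epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == 0
        && epoll_wait(ep, &ev, 1, 100) == 1)
        hup = (ev.events & EPOLLHUP) != 0;
    if (ep >= 0)
        close(ep);
    return hup;
}

/* Pipe: closing the only write end hangs up the read end. */
static int pipe_reports_hangup(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;
    close(p[1]);                    /* "client" goes away */
    int hup = sees_hangup(p[0]);
    close(p[0]);
    return hup;
}

/* eventfd: the reaper's copy refers to the same object, which stays
 * alive as long as any descriptor for it is open, so no hangup. */
static int eventfd_reports_hangup(void)
{
    int client = eventfd(0, 0);
    if (client < 0)
        return -1;
    int reaper = dup(client);       /* stands in for the SCM_RIGHTS copy */
    close(client);                  /* "client" goes away */
    int hup = sees_hangup(reaper);
    close(reaper);
    return hup;
}
```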
For completeness, here are the #defines used to construct the abstract address:
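As a sketch of how such definitions could look: the name `REAPER_NAME`, its value, `REAPER_ADDR_LEN`, and the helper `reaper_addr` are all hypothetical. An abstract address is a `sun_path` that starts with a NUL byte, and the length passed to bind() or sendmsg() has to cover exactly the family field, that leading NUL, and the name without a terminator.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

#define REAPER_NAME "example-lib-reaper"    /* hypothetical name */

/* sizeof(REAPER_NAME) counts the string's terminating NUL, which
 * conveniently equals the leading NUL plus the name's length. */
#define REAPER_ADDR_LEN \
    ((socklen_t)(offsetof(struct sockaddr_un, sun_path) + sizeof(REAPER_NAME)))

/* Fills `addr` with the abstract address for REAPER_NAME. */
static void reaper_addr(struct sockaddr_un *addr)
{
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path + 1, REAPER_NAME, sizeof(REAPER_NAME) - 1);
}
```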
That's it, hope this is useful for someone.