How do you prevent a file descriptor from being copy-inherited across fork() syscalls (without closing it, of course)?
I am looking for a way to mark a single file descriptor as NOT to be (copy-)inherited by children at fork(), something like FD_CLOEXEC but for forks (an FD_DONTINHERIT feature, if you like). Has anybody done this? Or looked into it and has a hint for me to start with?
Thank you
UPDATE:
I could use libc's __register_atfork
__register_atfork(NULL, NULL, fdcleaner, NULL)
to close the fds in the child just before fork() returns. However, the fds are still being copied, so this sounds like a silly hack to me. The question is how to skip the dup()-ing of unneeded fds into the child.
I'm thinking of some scenarios where a fcntl(fd,F_SETFL,F_DONTINHERIT) would be needed:
- fork() will copy an event fd (e.g. epoll), and sometimes this isn't wanted. For example, FreeBSD marks the kqueue() event fd as being of a KQUEUE_TYPE, and fds of this type are not copied across forks (kqueue fds are explicitly skipped; if one wants to use one from a child, the child must be forked with a shared fd table).
- fork() will copy 100k unneeded fds just to fork a child for some CPU-intensive task (suppose the need for a fork() is probabilistically very low and the programmer doesn't want to maintain a pool of children for something that normally wouldn't happen).
Some descriptors we want to be copied (0,1,2), some (most of them?) not. I think full fdtable duping is here for historic reasons but I am probably wrong.
How silly does this sound:
- patch fcntl to support the dontinherit flag on file descriptors (not sure if the flag should be kept per-fd or in a fdtable fd_set, like the close-on-exec flags are being kept)
- modify dup_fd() in the kernel to skip copying of dontinherit fds, same as FreeBSD does for kq fds
Consider the program:
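    #include <stdio.h>
    #include <unistd.h>
    #include <err.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <time.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* Build with e.g.: gcc -DNUMFDS=100000 -o forkit forkit.c */
    #ifndef NUMFDS
    #define NUMFDS 100
    #endif

    /* glibc-internal hook; not declared in any public header */
    extern int __register_atfork(void (*prepare)(void), void (*parent)(void),
                                 void (*child)(void), void *dso_handle);

    static int fds[NUMFDS];
    static clock_t t1;

    /* Close fds[0..n-1] */
    static void cleanup(int n)
    {
        while (n-- > 0)
            close(fds[n]);
    }

    /* prepare handler: runs in the parent just before fork() */
    static void clk_start(void)
    {
        t1 = clock();
    }

    /* parent handler: runs in the parent just after fork() */
    static void clk_end(void)
    {
        double tix = (double)(clock() - t1);
        double sex = tix / CLOCKS_PER_SEC;

        printf("fork_cost(%d fds)=%fticks(%f seconds)\n", NUMFDS, tix, sex);
    }

    int main(void)
    {
        pid_t pid;
        int i;

        __register_atfork(clk_start, clk_end, NULL, NULL);
        for (i = 0; i < NUMFDS; i++) {
            fds[i] = open("/dev/null", O_RDONLY);
            if (fds[i] == -1) {
                warn("open");       /* report errno before close() can touch it */
                cleanup(i);
                exit(EXIT_FAILURE);
            }
        }
        pid = fork();               /* timed by clk_start/clk_end */
        if (pid < 0)
            err(EXIT_FAILURE, "fork");
        if (pid == 0) {
            cleanup(NUMFDS);
            exit(0);
        }
        wait(NULL);
        cleanup(NUMFDS);
        return 0;
    }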
Of course, this can't be considered a real benchmark, but anyhow:
root@pinkpony:/home/cia/dev/kqueue# time ./forkit
fork_cost(100 fds)=0.000000ticks(0.000000 seconds)
real 0m0.004s
user 0m0.000s
sys 0m0.000s
root@pinkpony:/home/cia/dev/kqueue# gcc -DNUMFDS=100000 -o forkit forkit.c
root@pinkpony:/home/cia/dev/kqueue# time ./forkit
fork_cost(100000 fds)=10000.000000ticks(0.010000 seconds)
real 0m0.287s
user 0m0.010s
sys 0m0.240s
root@pinkpony:/home/cia/dev/kqueue# gcc -DNUMFDS=100 -o forkit forkit.c
root@pinkpony:/home/cia/dev/kqueue# time ./forkit
fork_cost(100 fds)=0.000000ticks(0.000000 seconds)
real 0m0.004s
user 0m0.000s
sys 0m0.000s
forkit ran on a Dell Inspiron 1520, Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz, with 4GB RAM; average_load=0.00
If you fork with the purpose of calling an exec function, you can use fcntl with FD_CLOEXEC to have the file descriptor closed once you exec.
Such a file descriptor will survive a fork but not functions of the exec family.
There's no standard way of doing this to my knowledge.
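What is standard is the close-on-exec flag mentioned above. As a minimal sketch of that mechanism (the helper name set_cloexec is made up for illustration), note that it only affects exec; the descriptor is still inherited across fork():

```c
#include <fcntl.h>

/* Mark fd close-on-exec, preserving any other descriptor flags.
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}
```

Where available, passing O_CLOEXEC to open() sets the same flag atomically at open time, which avoids a race with a concurrent fork()/exec in another thread.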
If you're looking to implement it properly, probably the best way to do it would be to add a system call to mark the file descriptor as close-on-fork, and to intercept the sys_fork system call (syscall number 2) to act on those flags after calling the original sys_fork.
If you don't want to add a new system call, you might be able to get away with intercepting sys_ioctl (syscall number 54) and just adding a new command to it for marking a file description close-on-fork.
Of course, if you can control what your application is doing, then it might be better to maintain user-level tables of all file descriptors you want closed on fork and call your own myfork instead. This would fork, then go through the user-level table closing those file descriptors so marked.
You wouldn't have to fiddle around in the Linux kernel then, a solution that's probably only necessary if you don't have control over the fork process (say, if a third party library is doing the fork() calls).
No. Close them yourself, since you know which ones need to be closed.