In my application, I have a process which forks off a child, say child1, and this child process writes a huge binary file on the disk and exits. The parent process then forks off another child process, child2, which reads in this huge file to do further processing.
The file dumping and re-loading is making my application slow and I'm thinking of possible
ways of avoiding disk I/O completely. Possible ways I have identified are ram-disk or tmpfs.
Can I somehow implement ram-disk or tmpfs from within my application? Or is there any other
way to avoid disk I/O completely and send data between processes reliably?
If the two sub-processes do not run at the same time, pipes or sockets won't work for you: their buffers are too small for the 'huge binary file', and the first process will block waiting for something to read the data.
In that case you need some kind of shared memory instead. You can use the SysV IPC shared memory API, the POSIX shared memory API (which internally uses tmpfs on recent Linux), or files on a tmpfs file system (usually mounted on /dev/shm, sometimes on /tmp) directly.
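To make the POSIX variant concrete, here is a minimal sketch, assuming Linux and a payload whose size both children know in advance. The object name /handoff_demo, the size SHM_SIZE and the helper names are made up for illustration; the parent runs the two children strictly one after the other, so no extra locking is used.

```c
#define _DEFAULT_SOURCE /* glibc: keep POSIX declarations visible under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SHM_NAME "/handoff_demo" /* hypothetical object name */
#define SHM_SIZE 4096            /* hypothetical payload size in bytes */

/* Runs fn() in a forked child and returns the child's exit status (-1 on failure). */
static int run_child(int (*fn)(void)) {
    assert(fn != NULL);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(fn());
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* child1: create the object, size it, fill it, then exit. */
static int producer(void) {
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0) return 1;
    if (ftruncate(fd, SHM_SIZE) < 0) return 1;
    unsigned char *p = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return 1;
    memset(p, 0xAB, SHM_SIZE); /* stands in for the real binary dump */
    return 0;
}

/* child2: open the existing object read-only and check what child1 left there. */
static int consumer(void) {
    int fd = shm_open(SHM_NAME, O_RDONLY, 0);
    if (fd < 0) return 1;
    const unsigned char *p = mmap(NULL, SHM_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return 1;
    for (size_t i = 0; i < SHM_SIZE; i++)
        if (p[i] != 0xAB) return 2;
    return 0;
}

/* Parent: run the children sequentially, then remove the object. Returns 0 on success. */
int shm_handoff(void) {
    int rc = run_child(producer);
    if (rc == 0) rc = run_child(consumer);
    shm_unlink(SHM_NAME); /* the object outlives child1 until it is unlinked */
    return rc;
}
```

Call shm_handoff() from main(). On older glibc versions you may need to link with -lrt for shm_open().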
Create an anonymous shared memory region before forking and then all children can use it after the fork:
char *shared = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
Be aware that you'll need some synchronization mechanism when sharing memory. One way to accomplish this is to put a process-shared mutex or semaphore (for example, a pthread mutex created with the PTHREAD_PROCESS_SHARED attribute) inside the shared memory region.
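For the scenario in the question, a rough sketch of the anonymous mapping approach could look like this. The function name anon_handoff and the byte pattern are invented for the example. Since child1 exits before child2 is forked, the parent's waitpid() is the only synchronization used here; children that run concurrently would need a lock as described above.

```c
#define _DEFAULT_SOURCE /* glibc: exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Waits for pid and returns its exit status, or -1 if it did not exit normally. */
static int wait_exit(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Returns 0 if child2 saw exactly the bytes that child1 wrote. */
int anon_handoff(size_t size) {
    assert(size > 0);

    /* Map the region BEFORE forking so every child inherits the same pages. */
    unsigned char *shared = mmap(NULL, size, PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED) return -1;

    int rc = -1;
    pid_t pid = fork();
    if (pid == 0) { /* child1: produce the data in memory instead of on disk */
        for (size_t i = 0; i < size; i++)
            shared[i] = (unsigned char)(i & 0xFF);
        _exit(0);
    }
    if (pid > 0) rc = wait_exit(pid); /* child1 is done writing after this */

    if (rc == 0) {
        pid = fork();
        if (pid == 0) { /* child2: consume the data child1 left behind */
            for (size_t i = 0; i < size; i++)
                if (shared[i] != (unsigned char)(i & 0xFF)) _exit(1);
            _exit(0);
        }
        rc = (pid > 0) ? wait_exit(pid) : -1;
    }

    munmap(shared, size);
    return rc;
}
```

For example, anon_handoff(1 << 20) hands 1 MiB from child1 to child2 without touching the file system.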
A named pipe is exactly what you want. You can write data into it and read data from it as if it were a file, but there's no need to store it on disk.
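A minimal sketch of the named pipe idea follows. Note that a FIFO only buffers a small amount of data, so this assumes the writer and the reader run at the same time, unlike the sequential setup in the question. The path /tmp/handoff_demo.fifo and all function names are placeholders.

```c
#define _DEFAULT_SOURCE /* glibc: keep POSIX declarations visible under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define FIFO_PATH "/tmp/handoff_demo.fifo" /* hypothetical path */

/* Writer child: streams `total` bytes of a fixed pattern into the FIFO. */
static int fifo_writer(size_t total) {
    int fd = open(FIFO_PATH, O_WRONLY); /* blocks until a reader opens the FIFO */
    if (fd < 0) return 1;
    unsigned char buf[4096];
    memset(buf, 0x5A, sizeof buf);
    while (total > 0) {
        size_t chunk = total < sizeof buf ? total : sizeof buf;
        ssize_t n = write(fd, buf, chunk);
        if (n < 0) return 1;
        total -= (size_t)n;
    }
    close(fd);
    return 0;
}

/* Reader child: reads until end-of-file and checks pattern and byte count. */
static int fifo_reader(size_t total) {
    int fd = open(FIFO_PATH, O_RDONLY);
    if (fd < 0) return 1;
    unsigned char buf[4096];
    size_t got = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] != 0x5A) return 2;
        got += (size_t)n;
    }
    close(fd);
    return (n == 0 && got == total) ? 0 : 3;
}

static int wait_exit(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Parent: create the FIFO, run both children concurrently. Returns 0 on success. */
int fifo_handoff(size_t total) {
    assert(total > 0);
    unlink(FIFO_PATH); /* drop a stale FIFO from an earlier run, if any */
    if (mkfifo(FIFO_PATH, 0600) < 0) return -1;

    pid_t w = fork();
    if (w == 0) _exit(fifo_writer(total));
    pid_t r = (w > 0) ? fork() : -1;
    if (r == 0) _exit(fifo_reader(total));
    if (w > 0 && r < 0) kill(w, SIGKILL); /* writer would block in open() forever */

    int rc = (w > 0) ? wait_exit(w) : -1;
    if (r > 0) {
        int rr = wait_exit(r);
        if (rc == 0) rc = rr;
    } else {
        rc = -1;
    }
    unlink(FIFO_PATH);
    return rc;
}
```

The data itself stays in kernel buffers; only the FIFO's name lives in the file system.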
You can use pipes or sockets and take advantage of the sendfile() or splice() features of the Linux kernel (they can avoid data copying).
Spawn the two processes and have them transfer the data via sockets. TCP will be easiest to get started, but if you want a bit more efficiency, use Unix Domain Sockets. This assumes you don't care about the data being written to disk per se.
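As a sketch of the Unix domain socket route, assuming both children can run concurrently and are forked from the same parent, socketpair() gives you a connected pair without any addresses or file system names. The function names and the byte pattern are invented for the example.

```c
#define _DEFAULT_SOURCE /* glibc: keep POSIX declarations visible under strict -std= modes */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Writes `total` bytes of a fixed pattern to fd, handling short writes. */
static int send_all(int fd, size_t total) {
    unsigned char buf[4096];
    memset(buf, 0x7E, sizeof buf);
    while (total > 0) {
        size_t chunk = total < sizeof buf ? total : sizeof buf;
        ssize_t n = write(fd, buf, chunk);
        if (n < 0) return 1;
        total -= (size_t)n;
    }
    return 0;
}

/* Reads until end-of-file and checks the pattern and the byte count. */
static int recv_all(int fd, size_t total) {
    unsigned char buf[4096];
    size_t got = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] != 0x7E) return 2;
        got += (size_t)n;
    }
    return (n == 0 && got == total) ? 0 : 3;
}

static int wait_exit(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Parent: child1 sends on sv[0], child2 receives on sv[1]. Returns 0 on success. */
int socket_handoff(size_t total) {
    assert(total > 0);
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;

    pid_t writer = fork();
    if (writer == 0) { close(sv[1]); _exit(send_all(sv[0], total)); }

    pid_t reader = (writer > 0) ? fork() : -1;
    /* The reader must close its copy of sv[0], or it would never see end-of-file. */
    if (reader == 0) { close(sv[0]); _exit(recv_all(sv[1], total)); }

    close(sv[0]); /* the parent keeps neither end open */
    close(sv[1]);

    int rc = (writer > 0) ? wait_exit(writer) : -1;
    if (reader > 0) {
        int rr = wait_exit(reader);
        if (rc == 0) rc = rr;
    } else {
        rc = -1;
    }
    return rc;
}
```

For unrelated processes you would bind an AF_UNIX socket to a path instead; socketpair() is just the shortest way to show the idea after fork().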
You can pass data between processes using pipes. Here is a good synopsis and example implementation.
In your case, the first child process (child1) exits before child2 comes into existence, so socket communication or unnamed pipes will not help.
But shared memory will do the job:
In child1, create a shared memory segment with read permission for all and write the data into that segment instead of dumping it to a file.
In child2, attach the shared memory segment to the current process's address space and read the dumped data.
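The two steps above can be sketched with the SysV API roughly as follows. The key SHM_KEY, the byte pattern and the function names are assumptions for illustration; a real program might derive the key with ftok() or have the parent create the segment and pass the id down.

```c
#define _DEFAULT_SOURCE /* glibc: keep SysV IPC declarations visible under strict -std= modes */
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SHM_KEY ((key_t)0x4D454D31) /* hypothetical key both children agree on */

/* Removes the segment for SHM_KEY if one exists. */
static void remove_segment(void) {
    int id = shmget(SHM_KEY, 0, 0);
    if (id >= 0) shmctl(id, IPC_RMID, NULL);
}

/* child1: create the segment (readable by all), fill it, detach and exit. */
static int child1(size_t size) {
    int id = shmget(SHM_KEY, size, IPC_CREAT | 0644);
    if (id < 0) return 1;
    unsigned char *p = shmat(id, NULL, 0);
    if (p == (void *)-1) return 1;
    memset(p, 0xC3, size); /* stands in for the real binary dump */
    shmdt(p);
    return 0;
}

/* child2: look up the existing segment, attach it read-only, read the data. */
static int child2(size_t size) {
    int id = shmget(SHM_KEY, size, 0);
    if (id < 0) return 1;
    const unsigned char *p = shmat(id, NULL, SHM_RDONLY);
    if (p == (void *)-1) return 1;
    int rc = 0;
    for (size_t i = 0; i < size; i++)
        if (p[i] != 0xC3) { rc = 2; break; }
    shmdt(p);
    return rc;
}

/* Forks a child that runs fn(size) and returns its exit status (-1 on failure). */
static int run_child(int (*fn)(size_t), size_t size) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(fn(size));
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Parent: run child1, then child2, then remove the segment. Returns 0 on success. */
int sysv_handoff(size_t size) {
    assert(size > 0);
    remove_segment(); /* clear a stale segment from an earlier run */
    int rc = run_child(child1, size);
    if (rc == 0) rc = run_child(child2, size);
    remove_segment(); /* SysV segments persist until explicitly removed */
    return rc;
}
```

The segment survives child1's exit because SysV shared memory lives until it is removed with IPC_RMID, which is also why the parent cleans it up at the end.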