I am playing with the ampersand “&”. I understand that in a bash shell script the ampersand forks a process and runs it in the background. This is useful because you get your prompt back immediately while the process keeps running.
Please observe the following code:
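#include <stdio.h>
#include <unistd.h>

int x = 5;                      /* global variable */

int main(void)
{
    int pid = (int)getpid();
    int y = 6;                  /* local variable */
    /* %p expects a void *, so cast the addresses */
    printf("[%d] [%p] x = %d\n", pid, (void *)&x, x++);
    printf("[%d] [%p] y = %d\n", pid, (void *)&y, y++);
    return 0;
}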
After successful compilation, I run the code with:
> ./a.out & ./a.out & ./a.out
Output of first run:
[4436] [0x601058] x = 5
[4435] [0x601058] x = 5
[4436] [0x7fff2d481bd8] y = 6
[4435] [0x7fff7ecadd88] y = 6
[4437] [0x601058] x = 5
[4437] [0x7fff6e0741d8] y = 6
Output of second run:
[4469] [0x601058] x = 5
[4469] [0x7fffa00048b8] y = 6
[4470] [0x601058] x = 5
[4470] [0x7fffd447a798] y = 6
[4468] [0x601058] x = 5
[4468] [0x7fffc35dc7b8] y = 6
Observations:
- Some print statements appear in different order because each process is running simultaneously.
- Address of variable x is the same on all instances because it’s a global variable.
- Value of x is the same in all instances because it gets reset every time to 5.
- Variable y is only local to main(), therefore its address will be unique in each process.
Here are my questions:
- Is the order in which the print statements appear determined by which process the OS scheduler started first?
- Variable x is global and seems to keep the same address in all runs/instances. Why isn’t its value shared among the processes after the post-increment? Why doesn’t ANY process print an incremented value of x?
Each process has its own address space in virtual memory (thanks to the processor's MMU). So variable x is not global to your 3 processes; every process has its own x, so address 0x601058 (the printed address of x) in process 4436 is not the same location as the "same" address 0x601058 in process 4435.
So (virtual) memory is specific to every process. A process can change its address space using mmap(2). You could use some advanced techniques to set up shared memory between several processes (but learn more about Linux programming first). See shm_overview(7) and sem_overview(7). As a newbie, you should not want to use shared memory yet, because of synchronization issues.
Read Advanced Linux Programming, it has several chapters related to your questions.
A multi-threaded process has several threads sharing the same address space (and other things like the current directory, open file descriptors, etc.). Also read a POSIX threads (a.k.a. pthreads) tutorial. Each thread has its own call stack.
Notice that addresses might not be reproducible from one run to the next because of ASLR. That randomization of the stack location is why the printed address of y differs in every process.
The Linux kernel has a scheduler working on tasks. A scheduled task is either a thread or a (single-threaded) process. The scheduler may preempt tasks at arbitrary moments, and on a multi-core processor you may have several tasks running in parallel (on different cores).
You could also play (on Linux) with proc(5). If you let your processes sleep for, e.g., 10 seconds, you could type (e.g. in a different terminal) cat /proc/4436/maps while your process 4436 is still running (or asleep).
You might also play with strace(1); maybe try strace ./a.out to see the relevant syscalls(2).
Of course, read the documentation of fork(2) and execve(2) several times.
Since the bash shell is free software, you could study its source code. It does call fork a lot!