I am confused about some topics regarding virtual memory, so I am going to list them point by point and ask questions. When answering, I would prefer it if you also listed a source where I can clear up that doubt. I will be talking with reference to a Linux ELF executable file.
I have heard that every process has an address space of 4 GB on a 32-bit system. When I checked the objdump output of one of my executable relocatable files, I saw that it had limits from 00000000 to ffffffff. It also contained the kernel space. This is the address space of the file. Is this the virtual memory we talk about? If yes, then I had read that the virtual memory mechanism allows very large processes to run and that process size is not limited by main memory size (we can bring the required pages into main memory on demand, i.e. demand paging). If virtual memory is just 4 GB, doesn't that limit the maximum size of programs to 4 GB? Also, I checked another file's objdump output and it had the same address range (i.e. 00000000 to ffffffff). So, what does this mean? Does it mean that our file is some kind of relocatable file to which starting addresses will be added again (although this seems absurd, because it is already an executable relocatable object file)?
I had read that in a memory system where segmentation has been implemented, the CPU produces a virtual (logical) address. This address has two parts: the segment and the offset within the segment. The segments being talked about here are code, data, stack, etc.
In the process address space, these segments are located starting from specific locations. So, what are the contents of the virtual address produced by the CPU? Does the virtual address range from 00000000 to ffffffff? If yes, is the process of accessing the content at the virtual address the following:
The segment part is looked up in the segment descriptor table to find the segment's starting address in the linear address space. Then the offset is added to that starting address, and the result is the linear address. Then we look up the page table and map the linear address to a physical address. If the page is not currently in main memory, it is brought in.
This again raises the point that no process can be fully in main memory at any time, because then the entire memory would be occupied by just one process (as the address space of a process is itself 4 GB).
Also, if all processes have an address space from 00000000 to ffffffff, and more than one process can exist in main memory at a time, then all processes should have their own segment descriptor table, which returns the segment's address in the linear address space.
I read that the operating system is loaded into main memory at boot-up. Then what is the difference between that OS and the kernel code in the kernel space of a particular process? Also, do all processes have their own copy of the kernel code in their kernel space?
This is a very open-ended question that has many confused uses of different terms. I'll try to address as much of your question as I can, and provide some other useful information that may help.
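"I have heard that every process has the address space of 4gb in a 32 bit system." Roughly true, with a caveat. With 32-bit addresses a process can address at most 2^32 bytes, which is 4 GiB, but on 32-bit Linux part of that range (by default the top 1 GiB) is reserved for the kernel, leaving about 3 GiB for the program itself. That doesn't mean that this memory is ever allocated, and it certainly isn't allocated as soon as a process launches. "Is this the virtual memory we talk about?" Not exactly. That range is the process's address space; virtual memory is the mechanism that maps addresses in it onto physical memory (and disk), which is not the same thing as the size of the addressable space. More on this later.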
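As a quick check of the numbers, here is a minimal sketch of the arithmetic. The 3 GiB / 1 GiB user/kernel split is an assumption on my part (it is the common default on 32-bit x86 Linux, but it is configurable), and the variable names are purely illustrative.

```python
# Size of a 32-bit address space, plus an assumed user/kernel split.
ADDRESS_BITS = 32
GIB = 1024 ** 3

address_space_bytes = 2 ** ADDRESS_BITS      # number of distinct byte addresses
highest_address = address_space_bytes - 1    # 0xffffffff

# Assumption: the kernel occupies addresses from 0xC0000000 upward
# (a common default on 32-bit x86 Linux, not a universal rule).
KERNEL_START = 0xC0000000
user_bytes = KERNEL_START
kernel_bytes = address_space_bytes - KERNEL_START

print(f"total:  {address_space_bytes // GIB} GiB (0x00000000 to {highest_address:#010x})")
print(f"user:   {user_bytes // GIB} GiB")
print(f"kernel: {kernel_bytes // GIB} GiB")
```

Note that this is only the range of addresses a process can name, not the amount of memory it has actually been given.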
The question of whether a process can ever be fully in main memory doesn't really make sense as posed, for reasons I will explain below. It's worth noting, though, that multiple processes clearly do fit in memory at one time, because processes don't automatically allocate their full potentially available memory. (If a text editor allocated 4 GB of memory as soon as it was opened, it would not be a popular text editor!)
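To illustrate the point, here is a toy model (not how any real kernel is written) of an address space that spans 4 GiB but only consumes physical memory for pages that are actually touched. The class `ToyAddressSpace`, the 4096-byte page size, and the example addresses are all assumptions made up for this sketch.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class ToyAddressSpace:
    """Hypothetical model: pages receive a physical frame only when first touched."""

    def __init__(self, size_bytes):
        self.size_bytes = size_bytes
        self.page_table = {}       # virtual page number -> physical frame number
        self.next_free_frame = 0

    def touch(self, virtual_address):
        """Access an address, allocating a frame on first use (a simulated page fault)."""
        if not 0 <= virtual_address < self.size_bytes:
            raise ValueError("address outside the process's address space")
        page = virtual_address // PAGE_SIZE
        if page not in self.page_table:
            self.page_table[page] = self.next_free_frame
            self.next_free_frame += 1
        return self.page_table[page]

    def resident_bytes(self):
        """Physical memory actually in use by this address space."""
        return len(self.page_table) * PAGE_SIZE


editor = ToyAddressSpace(size_bytes=2 ** 32)   # 4 GiB of addressable range
editor.touch(0x08048000)   # first access to a page: a frame is allocated
editor.touch(0x08048010)   # same page: no new frame needed
editor.touch(0xBFFFF000)   # a far-away page: one more frame
print(editor.resident_bytes())  # 8192
```

The address space is 4 GiB wide, yet only two pages' worth of physical memory is in use, which is why many such processes can coexist.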
I'm no expert, but I highly doubt that every program has its own copy of kernel code at runtime. The security and performance issues alone make this a very unlikely solution.
So now, some definitions that may help you.
The 00000000 to ffffffff range is the logical range available to the process, and these are the addresses that will be used within the process to reference memory. Each such address is translated to a physical address when the code actually executes (the hardware does this using mappings that the kernel maintains), based on the physical offset (and segmentation) of the process's memory. This physical location could be anywhere in the available memory space and, if the application is paged out and back in, the physical location may change during the lifetime of the application. However, the application itself need only ever refer to its own logical address space. The term "logical" versus "physical" address is used to highlight that an address is not the real address, but is the address relative to the relevant subset of memory, that is, to the process's own memory space.
I'm no expert on this, but I hope this helps clarify some of your questions.
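The logical-to-physical translation described above can be sketched roughly as follows. This is a simplified model with a single-level page table per process; the function `translate`, the page size, and the frame numbers are invented for illustration and do not reflect real kernel data structures.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(page_table, virtual_address):
    """Map a logical address to a physical one using a per-process page table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError(f"page fault at {virtual_address:#010x}")
    return page_table[page] * PAGE_SIZE + offset


# Hypothetical page tables: virtual page number -> physical frame number.
process_a = {0x08048: 0x00123}
process_b = {0x08048: 0x00777}

address = 0x08048ABC  # the same logical address, used by both processes
print(hex(translate(process_a, address)))  # 0x123abc
print(hex(translate(process_b, address)))  # 0x777abc

# If the page is paged out and later brought back in, it may land in a
# different frame; the logical address the process uses stays the same.
process_a[0x08048] = 0x00456
print(hex(translate(process_a, address)))  # 0x456abc
```

Both processes use the same logical address, but each has its own mapping, so they end up at different physical locations without ever knowing about each other.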