I know the relationship between the two:
virtual address mod page alignment == file offset mod page alignment
But can someone tell me in which direction these two numbers are computed?
Is virtual address computed from file offset according to the relationship above, or vice versa?
Update
Here is some more detail: when the linker writes the ELF file header, it sets the virtual address and file offset of the program headers (segments).
For example, here is the output of readelf -l someELFfile:
Elf file type is EXEC (Executable file)
Entry point 0x8048094
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x08048000 0x08048000 0x00154 0x00154 R E 0x1000
  LOAD           0x000154 0x08049154 0x08049154 0x00004 0x00004 RW  0x1000
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
We can see 2 LOAD segments.
The virtual address of the first LOAD ends at 0x8048154, while the second LOAD starts at 0x8049154.
In the ELF file, the second LOAD immediately follows the first, at file offset 0x00154. However, when this ELF is loaded into memory, it starts 0x1000 bytes after the end of the first LOAD segment.
But why? If we have to consider memory page alignment, why doesn't the second LOAD segment start at 0x08049000? Why does it start 0x1000 bytes AFTER THE END of the first LOAD segment?
I know the virtual address of the second LOAD satisfies the relationship:
virtual address mod page alignment == file offset mod page alignment
But I don't know why this relationship must be satisfied.
If it didn't, it would have to start at 0x08048154, but it can't: the two LOAD segments have different flags specified for their mapping (the first is mapped with PROT_READ|PROT_EXEC, the second with PROT_READ|PROT_WRITE). Protections (being part of the page table) can only apply to whole pages, not parts of a page. Therefore, mappings with different protections must belong to different pages.

The LOAD segments are directly mmaped from the file. The actual mapping of the second LOAD segment performed for your example maps the file at the page-aligned address 0x08049000, starting from file offset 0 (you can run your program under strace and see that it does).

If you try to make the virtual address or the offset non-page-aligned, mmap will fail with EINVAL. The only way to make file data appear in virtual memory at the desired address is to make VirtAddr congruent to Offset modulo Align, and that is exactly what the static linker does.

Note that for such a small first LOAD segment, the entire first segment also appears at the beginning of the second mapping (with the wrong protections). But the program is not supposed to access anything in the [0x08049000,0x08049154) range. In general, it is almost always the case that there is some "junk" before the start of actual data in the second LOAD segment (unless you get really lucky and the first LOAD segment ends on a page boundary).

See also the mmap man page.
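
To make the arithmetic concrete, here is a minimal Python sketch of how a loader could derive page-aligned mmap arguments from a LOAD entry. The helper name page_mapping is hypothetical (it is not part of any loader or library), and the sketch assumes Align is a power of two and ignores rounding the length up to a whole page, which the kernel does on its own.

```python
def page_mapping(offset, vaddr, filesz, align=0x1000):
    """Return (map_addr, map_len, map_offset) for one LOAD segment.

    Hypothetical helper for illustration; assumes align is a power of two.
    """
    if vaddr % align != offset % align:
        # Without this congruence, no page-aligned (address, offset) pair
        # can place the file bytes at the requested virtual address.
        raise ValueError("VirtAddr and Offset are not congruent modulo Align")
    slack = vaddr % align        # bytes of "junk" in front of the segment data
    map_addr = vaddr - slack     # round the address down to a page boundary
    map_offset = offset - slack  # round the file offset down by the same amount
    map_len = filesz + slack     # cover the junk plus the real data
    return map_addr, map_len, map_offset


# Second LOAD segment from the readelf output in the question.
addr, length, off = page_mapping(offset=0x154, vaddr=0x08049154, filesz=0x4)
print(hex(addr), hex(length), hex(off))  # prints: 0x8049000 0x158 0x0
```

The result says the second segment is mapped at 0x08049000 from file offset 0 with length 0x158, so the first 0x154 bytes of that page are the "junk" copy of the first segment and the 4 bytes of real data land at 0x08049154.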