As part of trying to write a compiler completely from scratch, I'm currently working on the part that handles ELF files.
After skimming through several articles and specifications about them, I still don't quite understand where section to segment mappings are stored.
When observing small executables generated by NASM+ld, I can see that the .text section is somehow mapped onto a LOAD-type program header, but how?
A small piece of readelf's output when given a small (working) executable as input:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000000084 0x0000000000000084 R E 200000
Section to Segment mapping:
Segment Sections...
00 .text
Is this mapping even required to have a working executable? Or can it be omitted completely and still leave you with a valid executable?
I still don't quite understand where section to segment mappings are stored.
They are not stored anywhere.
Rather, readelf computes the mapping by looking at the file offsets and sizes of the sections and segments.
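To make the idea concrete, here is a minimal Python sketch of that computation. It assumes a section belongs to a segment when its file range lies entirely inside the segment's file range. The Section and Segment types, the function name, and the sample offsets are all hypothetical, chosen only for illustration. readelf itself applies more checks than this (for example on virtual addresses and section flags), so treat it as a simplification rather than the tool's actual logic.

```python
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    offset: int  # sh_offset: where the section starts in the file
    size: int    # sh_size: bytes occupied in the file


class Segment(NamedTuple):
    type: str
    offset: int  # p_offset: where the segment starts in the file
    filesz: int  # p_filesz: bytes occupied in the file


def sections_in_segment(seg, sections):
    """Return names of sections whose file range lies inside the segment."""
    seg_end = seg.offset + seg.filesz
    return [
        s.name
        for s in sections
        if s.offset >= seg.offset and s.offset + s.size <= seg_end
    ]


# Hypothetical layout: a LOAD segment covering the first 0x84 bytes of the
# file, a .text section inside it, and a .symtab section after it.
load = Segment("LOAD", 0x0, 0x84)
sections = [Section(".text", 0x80, 0x4), Section(".symtab", 0x88, 0x60)]

mapping = sections_in_segment(load, sections)
print(mapping)
```

With this made-up layout only .text falls inside the LOAD segment, because .symtab starts past the segment's end in the file.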
I ran a test based on @Employed Russian's answer.
readelf -l ./libandroid_servers.so
Elf file type is DYN (Shared object file)
Entry point 0x0
There are 6 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x00000034 0x00000034 0x000c0 0x000c0 R 0x4
LOAD 0x000000 0x00000000 0x00000000 0x0f830 0x0f830 R E 0x1000
LOAD 0x010000 0x00010000 0x00010000 0x00cf4 0x011ac RW 0x1000
DYNAMIC 0x010540 0x00010540 0x00010540 0x00130 0x00130 RW 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0
EXIDX 0x00f2e8 0x0000f2e8 0x0000f2e8 0x00548 0x00548 R 0x4
Section to Segment mapping:
Segment Sections...
00
01 .hash .dynsym .dynstr .rel.plt .rel.dyn .plt .text .rodata .ARM.extab .ARM.exidx
02 .init_array .fini_array .data.rel.ro .dynamic .got .data .bss
03 .dynamic
04
05 .ARM.exidx
Segment 01 (LOAD): offset 0x000000, FileSiz 0x0f830
.ARM.exidx section end offset: hex(0xF2E8 + 1352) = 0xf830
Segment 02 (LOAD): offset 0x010000, FileSiz 0x00cf4
.init_array section start offset: 0x10000
.bss section end offset: hex(0x10cf4 + 0) = 0x10cf4
So readelf does indeed work out which sections belong to each segment by computation: the section boundaries line up with the segment boundaries.
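The same check can be repeated in a few lines of Python. The numbers are copied from the readelf output above. The variable names are only illustrative, and the .ARM.exidx range is taken from the EXIDX program header (offset 0xf2e8, FileSiz 0x548, which is 1352 in decimal).

```python
# First LOAD segment: file offset + FileSiz
load1_end = 0x000000 + 0x0f830

# .ARM.exidx, the last section listed in segment 01
exidx_end = 0x00f2e8 + 0x00548

# Second LOAD segment: file offset + FileSiz
load2_end = 0x010000 + 0x00cf4

print(hex(load1_end), hex(exidx_end), hex(load2_end))
# prints: 0xf830 0xf830 0x10cf4
```

The first two values agree, so the last section of segment 01 ends exactly where the segment's file image ends.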