To study how the object file loaded and run in linux, I made the simplest c code, file name simple.c.
int main(){}
Next, I make object file and save object file as text file.
$gcc ./simple.c
$objdump -xD ./a.out > simple.text
From many internet articles, I could catch that gcc dynamically load initiating functions like _start, _init, __libc_start_main@plt, and so on. So I started to read my assembly code, helped by http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html .
Here is the some part of assembly code.
080482e0 <__libc_start_main@plt>:
80482e0: ff 25 10 a0 04 08 jmp *0x804a010
80482e6: 68 08 00 00 00 push $0x8
80482eb: e9 d0 ff ff ff jmp 80482c0 <_init+0x2c>
Disassembly of section .text:
080482f0 <_start>:
80482f0: 31 ed xor %ebp,%ebp
80482f2: 5e pop %esi
80482f3: 89 e1 mov %esp,%ecx
80482f5: 83 e4 f0 and $0xfffffff0,%esp
80482f8: 50 push %eax
80482f9: 54 push %esp
80482fa: 52 push %edx
80482fb: 68 70 84 04 08 push $0x8048470
8048300: 68 00 84 04 08 push $0x8048400
8048305: 51 push %ecx
8048306: 56 push %esi
8048307: 68 ed 83 04 08 push $0x80483ed
804830c: e8 cf ff ff ff call 80482e0 <__libc_start_main@plt>
8048311: f4 hlt
8048312: 66 90 xchg %ax,%ax
8048314: 66 90 xchg %ax,%ax
8048316: 66 90 xchg %ax,%ax
8048318: 66 90 xchg %ax,%ax
804831a: 66 90 xchg %ax,%ax
804831c: 66 90 xchg %ax,%ax
804831e: 66 90 xchg %ax,%ax
080483ed <main>:
80483ed: 55 push %ebp
80483ee: 89 e5 mov %esp,%ebp
80483f0: b8 00 00 00 00 mov $0x0,%eax
80483f5: 5d pop %ebp
80483f6: c3 ret
80483f7: 66 90 xchg %ax,%ax
80483f9: 66 90 xchg %ax,%ax
80483fb: 66 90 xchg %ax,%ax
80483fd: 66 90 xchg %ax,%ax
80483ff: 90 nop
...
Disassembly of section .got:
08049ffc <.got>:
8049ffc: 00 00 add %al,(%eax)
...
Disassembly of section .got.plt:
0804a000 <_GLOBAL_OFFSET_TABLE_>:
804a000: 14 9f adc $0x9f,%al
804a002: 04 08 add $0x8,%al
...
804a00c: d6 (bad)
804a00d: 82 (bad)
804a00e: 04 08 add $0x8,%al
804a010: e6 82 out %al,$0x82
804a012: 04 08 add $0x8,%al
My question is;
In 0x804830c, 0x80482e0 is called (I've already apprehended the previous instructions.).
In 0x80482e0, the process jump to 0x804a010.
In 0x804a010, the instruction is < out %al,$0x82 >
...wait. just out? What was in the %al and where is 0x82?? I got stuck in this line.
Please help....
*p.s. I'm beginner to linux and operating system. I'm studying operating system concepts by school class, but still can not find how to study proper linux assembly language. I've already downloaded intel processor manual but it is too huge to read. Can anyone inform me good material for me? Thanks.
This means "retrieve the 4-byte address stored at 0x804a010 and jump to it."
Those 4 bytes will be treated as an address, 0x80482e6, not as instructions.
So we've just executed an instruction that has moved us exactly one instruction forward. At this point, you're probably wondering if there's a good reason for this.
There is. This is a typical PLT/GOT implementation. Much more detail, including a diagram, is at Position Independent Code in shared libraries: The Procedure Linkage Table.
The real code for
__libc_start_main
is in a shared library,glibc
. The compiler and compile-time linker don't know where the code will be at run-time, so they place in your compiled program a short__libc_start_main
function which contains just three instructions:The first time you call
__libc_start_main
, the resolver code will run. It will find the actual location of__libc_start_main
in a shared library and will patch the 4th entry of the GOT to be that address. If your program calls__libc_start_main
again, thejmp *0x804a010
instruction will take the program directly to the code in the shared library.The x86 Assembly book at Wikibooks might be one place to start.