Keep getting e8 00 00 00 00 as the machine code to

Posted 2019-02-20 00:58

Question:

I know that when I run objdump -dr on my object file, a call shows up in the machine code as e8 00 00 00 00 because the file has not been linked yet. But I need to find out what the 00 00 00 00 will turn into after the linker has done its job. I know it should calculate the offset, but I'm a little confused about how.

Using the code below as an example, what should the e8 00 00 00 00 become after linking? And how do I get to that answer?

I'm testing with this sample code (I'm trying to call moo):

Disassembly of section .text:

0000000000000000 <foo>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   89 7d fc                mov    %edi,-0x4(%rbp)
   7:   8b 45 fc                mov    -0x4(%rbp),%eax
   a:   83 e8 0a                sub    $0xa,%eax
   d:   5d                      pop    %rbp
   e:   c3                      retq   

000000000000000f <moo>:
   f:   55                      push   %rbp
  10:   48 89 e5                mov    %rsp,%rbp
  13:   89 7d fc                mov    %edi,-0x4(%rbp)
  16:   b8 01 00 00 00          mov    $0x1,%eax
  1b:   5d                      pop    %rbp
  1c:   c3                      retq   

000000000000001d <main>:
  1d:   55                      push   %rbp
  1e:   48 89 e5                mov    %rsp,%rbp
  21:   48 83 ec 10             sub    $0x10,%rsp
  25:   c7 45 fc 8e 0c 00 00    movl   $0xc8e,-0x4(%rbp)
  2c:   8b 45 fc                mov    -0x4(%rbp),%eax
  2f:   89 c7                   mov    %eax,%edi
  31:   e8 00 00 00 00          callq  36 <main+0x19>
            32: R_X86_64_PC32   moo-0x4
  36:   89 45 fc                mov    %eax,-0x4(%rbp)
  39:   b8 00 00 00 00          mov    $0x0,%eax
  3e:   c9                      leaveq 
  3f:   c3                      retq

Answer 1:

With objdump -r, the relocations are printed alongside the disassembly from -d:

  31:   e8 00 00 00 00          callq  36 <main+0x19>
            32: R_X86_64_PC32   moo-0x4

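The static linker (ld) resolves this PC-relative relocation when it builds the executable. The ld-linux.so.2 loader then relocates shared objects (on modern systems it relocates even the executable itself to a random address) and fills in any remaining relocations with the correct addresses.

Check with gdb by setting a breakpoint at main and starting the program (the linker and loader have finished before main starts):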
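
To make the calculation concrete: the System V x86-64 ABI defines R_X86_64_PC32 as S + A - P, where S is the address of the symbol, A is the addend (-4 here) and P is the address of the 4-byte field being patched (0x32 here). The following is a minimal sketch in C, not the linker's actual code; the function names pc32_value and patch_call are made up for illustration. Because moo and main sit in the same .text section of this object file, S - P should stay the same wherever that section ends up, so the object-file addresses can be used directly.

```c
#include <stdint.h>

/* R_X86_64_PC32 is computed as S + A - P, truncated to 32 bits:
 *   S = address of the target symbol
 *   A = addend from the relocation record
 *   P = address of the 4-byte field being patched
 * (hypothetical helper, for illustration only) */
uint32_t pc32_value(uint64_t s, int64_t a, uint64_t p)
{
    return (uint32_t)(s + (uint64_t)a - p);
}

/* Build the 5-byte call: opcode e8 followed by rel32 in little-endian order. */
void patch_call(uint8_t insn[5], uint32_t rel32)
{
    insn[0] = 0xe8;
    for (int i = 0; i < 4; i++)
        insn[1 + i] = (uint8_t)(rel32 >> (8 * i));
}
```

With the addresses from the question, pc32_value(0xf, -4, 0x32) gives 0xffffffd9 (that is, 0xf - 0x36 = -0x27), so the linked instruction would be e8 d9 ff ff ff. The -4 is there because the CPU measures the displacement from the end of the call (0x36), while the field itself starts 4 bytes earlier (0x32).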

gdb ./program
(gdb) start
(gdb) disassemble main

If you want to compile the code without relocations, please post the source code and the compilation options.



Answer 2:

Object files and executable files on several architectures that I know of do not necessarily fix jump destinations at link time.

This is a feature which provides flexibility.

Jump target addresses do not have to be fixed until just before the instruction executes. They do not need to be fixed up at link time—nor even at program start time!

Most systems (Windows, Linux, Unix, VAX/VMS) tag such locations in the object code as addresses that need adjustment. They also record additional information: what the target is and what type of reference it is (absolute or relative; 16-bit, 24-bit, 32-bit, 64-bit, etc.).

The zero value there is not necessarily a placeholder, but the base value upon which to evaluate the result. For example, if the instruction were—for whatever reason—call 5+external_address, then there might be 5 (e8 05 00 00 00) in the object code.
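
As an illustration of a field that holds a base value: on targets that use REL-style relocations (32-bit x86 ELF is one), the addend is stored in the field itself, so an unlinked call typically shows up as e8 fc ff ff ff (that is, -4) rather than e8 00 00 00 00. On x86-64, as in the question, the addend lives in the relocation record instead (the moo-0x4 printed by objdump). Below is a rough sketch of the in-place variant; the helper names are hypothetical and this is not any real linker's code.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from the instruction bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Write a 32-bit value back in little-endian order. */
void write_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* REL-style PC-relative fixup: whatever is already stored in the field
 * is used as the addend, then replaced by symbol + addend - place. */
void apply_rel_pc32(uint8_t *field, uint32_t sym_addr, uint32_t place)
{
    uint32_t addend = read_le32(field);
    write_le32(field, sym_addr + addend - place);
}
```

For example, with a field containing fc ff ff ff at a hypothetical address 0x32 and a target at 0xf, the field becomes d9 ff ff ff. If the field had started out as 00 00 00 00 the result would be off by four, which is why the stored value is a real input to the calculation and not just filler.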

If you want to see what the address is at execution time, run the program under a debugger, place a breakpoint at that instruction and then view the instruction just before it executes.


A common security feature known as ASLR (address space layout randomization) intentionally loads program sections at unpredictable addresses to thwart malicious code that alters programs or data. Programs operating in this environment may not have some target addresses assigned until after the program has run for a bit.

(Of related interest, VAX/VMS in particular has a complex fixup mode in which an equation describes the operations needed to compute a value. Operations include addition, subtraction, multiplication, division, shifting, rotating, and probably others. I never saw it actually used, but it was interesting to contemplate how one might apply the capability.)



Answer 3:

But you clearly know how to do all of this. You know how to disassemble before linking; just disassemble again after linking to see how the linker modifies those instructions.

asm(".globl _start; _start: nop\n");

unsigned int foo ( unsigned int x )
{
    return(x+5);
}
unsigned int moo ( unsigned int x )
{
    return(foo(x)+3);
}

int main ( void )
{
    return(moo(3)+2);
}

Before linking:

0000000000000000 <_start>:
   0:   90                      nop

0000000000000001 <foo>:
   1:   55                      push   %rbp
   2:   48 89 e5                mov    %rsp,%rbp
   5:   89 7d fc                mov    %edi,-0x4(%rbp)
   8:   8b 45 fc                mov    -0x4(%rbp),%eax
   b:   83 c0 05                add    $0x5,%eax
   e:   5d                      pop    %rbp
   f:   c3                      retq   

0000000000000010 <moo>:
  10:   55                      push   %rbp
  11:   48 89 e5                mov    %rsp,%rbp
  14:   48 83 ec 08             sub    $0x8,%rsp
  18:   89 7d fc                mov    %edi,-0x4(%rbp)
  1b:   8b 45 fc                mov    -0x4(%rbp),%eax
  1e:   89 c7                   mov    %eax,%edi
  20:   e8 00 00 00 00          callq  25 <moo+0x15>
  25:   83 c0 03                add    $0x3,%eax
  28:   c9                      leaveq 
  29:   c3                      retq   

000000000000002a <main>:
  2a:   55                      push   %rbp
  2b:   48 89 e5                mov    %rsp,%rbp
  2e:   bf 03 00 00 00          mov    $0x3,%edi
  33:   e8 00 00 00 00          callq  38 <main+0xe>
  38:   83 c0 02                add    $0x2,%eax
  3b:   5d                      pop    %rbp
  3c:   c3                      retq   

After linking:


0000000000001000 <_start>:
    1000:   90                      nop

0000000000001001 <foo>:
    1001:   55                      push   %rbp
    1002:   48 89 e5                mov    %rsp,%rbp
    1005:   89 7d fc                mov    %edi,-0x4(%rbp)
    1008:   8b 45 fc                mov    -0x4(%rbp),%eax
    100b:   83 c0 05                add    $0x5,%eax
    100e:   5d                      pop    %rbp
    100f:   c3                      retq   

0000000000001010 <moo>:
    1010:   55                      push   %rbp
    1011:   48 89 e5                mov    %rsp,%rbp
    1014:   48 83 ec 08             sub    $0x8,%rsp
    1018:   89 7d fc                mov    %edi,-0x4(%rbp)
    101b:   8b 45 fc                mov    -0x4(%rbp),%eax
    101e:   89 c7                   mov    %eax,%edi
    1020:   e8 dc ff ff ff          callq  1001 <foo>
    1025:   83 c0 03                add    $0x3,%eax
    1028:   c9                      leaveq 
    1029:   c3                      retq   

000000000000102a <main>:
    102a:   55                      push   %rbp
    102b:   48 89 e5                mov    %rsp,%rbp
    102e:   bf 03 00 00 00          mov    $0x3,%edi
    1033:   e8 d8 ff ff ff          callq  1010 <moo>
    1038:   83 c0 02                add    $0x2,%eax
    103b:   5d                      pop    %rbp
    103c:   c3                      retq   

For example, compare the call to moo in main before and after linking:

33:   e8 00 00 00 00        callq  38 <main+0xe>
1033: e8 d8 ff ff ff        callq  1010 <moo>
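
To check the linked bytes by hand, go the other way: read the four bytes after e8 as a signed little-endian 32-bit displacement and add it to the address of the next instruction (the call's address plus 5). A small sketch in C, with a hypothetical function name:

```c
#include <stdint.h>

/* Decode the destination of a 5-byte `call rel32` (opcode e8)
 * located at insn_addr. The displacement is relative to the
 * address of the instruction that follows the call. */
uint64_t call_target(uint64_t insn_addr, const uint8_t insn[5])
{
    uint32_t raw = (uint32_t)insn[1] | (uint32_t)insn[2] << 8 |
                   (uint32_t)insn[3] << 16 | (uint32_t)insn[4] << 24;
    int32_t rel = (int32_t)raw;          /* sign-extend the displacement */
    return insn_addr + 5 + (uint64_t)(int64_t)rel;
}
```

For the linked call at 0x1033, d8 ff ff ff is -0x28, and 0x1038 - 0x28 = 0x1010, which is moo. For an unlinked e8 00 00 00 00 the displacement is zero, so the decoded target is simply the next instruction; that is why objdump prints targets such as 38 <main+0xe> for calls that have not been resolved yet.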