Why does the linker generate seemingly useless rel

Posted 2019-06-22 06:56

Question:

First, the toy program I'm playing with:

prog.c:

int func1();

int main(int argc, char const *argv[])
{
    func1();
    return 0;
}

lib.c:

int func1()
{
    return 0;
}

Build with:

gcc -O3 -g -shared -fpic ./lib.c -o liba.so
gcc prog.c -g -la -L. -o prog -Wl,-rpath=$PWD

And for completeness:

$ gcc --version
gcc (GCC) 6.3.1 

$ ld --version
GNU ld version 2.26.1

Now, my question. I've confirmed what I've read about how lazy binding of dynamic symbols works, i.e. that initially the GOT entry for func1 points right back at the PLT, to the instruction following the jump:

$ gdb prog    

(gdb) disassemble main
Dump of assembler code for function main:
   0x0000000000400666 <+0>: push   rbp
   0x0000000000400667 <+1>: mov    rbp,rsp
   0x000000000040066a <+4>: sub    rsp,0x10
   0x000000000040066e <+8>: mov    DWORD PTR [rbp-0x4],edi
   0x0000000000400671 <+11>:    mov    QWORD PTR [rbp-0x10],rsi
   0x0000000000400675 <+15>:    mov    eax,0x0
   0x000000000040067a <+20>:    call   0x400560 <func1@plt>    <<< call to shared lib via PLT
   0x000000000040067f <+25>:    mov    eax,0x0
   0x0000000000400684 <+30>:    leave  
   0x0000000000400685 <+31>:    ret    
End of assembler dump.
(gdb) disassemble 0x400560
Dump of assembler code for function func1@plt:
   0x0000000000400560 <+0>: jmp    QWORD PTR [rip+0x200ab2]        # 0x601018 <<< JMP to address stored in GOT
   0x0000000000400566 <+6>: push   0x0             <<< ... which initially points right back here
   0x000000000040056b <+11>:    jmp    0x400550
End of assembler dump. 
(gdb) x/g 0x601018
0x601018:   0x400566   <<< GOT entry points right back to the instruction after the just-executed jump

This is fine. Now, examining the .got.plt section, whose slot at 0x601018 contains this pointer back to 0x400566, shows that the binary itself holds the address, without the dynamic linker doing anything at load time. Which makes sense, since this address is known at link time:

$ readelf -x .got.plt ./prog 

Hex dump of section '.got.plt':
 NOTE: This section has relocations against it, but these have NOT been applied to this dump.
  0x00601000 ???????? ???????? ???????? ???????? ..`.............
  0x00601010 ???????? ???????? 66054000 ???????? ........f.@.....

(where 66054000 is the little-endian representation of the address 0x400566 that is initially stored in the GOT)

Fine, fine. But then...

$ readelf -r prog                             
<...>
Relocation section '.rela.plt' at offset 0x518 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000601018  000100000007 R_X86_64_JUMP_SLO 0000000000000000 func1 + 0

Why is there a relocation entry for this GOT address? We've seen that the correct address is already there, in the binary, placed there at link time. I've also experimentally edited this relocation to make it ineffectual, and the program runs fine. So the reloc seems to contribute nothing. What is it doing there, and more generally, in which scenarios do the relocations in .rela.plt actually matter?

Update #1:

To be clear, this is about PIC and 64-bit code. Here are the relevant section headers, to help clarify which section each address belongs to:

$ readelf -S ./prog
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 9] .rela.dyn         RELA             00000000004004e8  000004e8
       0000000000000030  0000000000000018   A       5     0     8
  [10] .rela.plt         RELA             0000000000400518  00000518
       0000000000000018  0000000000000018  AI       5    23     8
  [12] .plt              PROGBITS         0000000000400550  00000550
       0000000000000020  0000000000000010  AX       0     0     16
  [21] .dynamic          DYNAMIC          0000000000600e00  00000e00
       00000000000001f0  0000000000000010  WA       6     0     8
  [22] .got              PROGBITS         0000000000600ff0  00000ff0
       0000000000000010  0000000000000008  WA       0     0     8
  [23] .got.plt          PROGBITS         0000000000601000  00001000
       0000000000000020  0000000000000008  WA       0     0     8

Update #2:

Editing the section header for .rela.plt doesn't change the process image, so I had not actually disabled the reloc. I had also tried changing the reloc address (to another writable address) and that didn't seem to make a difference either, but it turns out the address isn't used by the resolver during lazy binding, although the rest of the reloc is. The address itself is only used if lazy binding is turned off with LD_BIND_NOW.

Thanks @yugr

Answer 1:

This is fine. Now, examining the .got.plt section, whose slot at 0x601018 contains this pointer back to 0x400566, shows that the binary itself holds the address, without the dynamic linker doing anything at load time.

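Not really. Note that the code at 0x400566 ends up jumping to 0x400550, i.e. 16 bytes before the func1@plt stub. That is the first entry of the PLT: it pushes the second .got.plt entry (at 0x601008, filled in by the dynamic linker at load time to identify this module) onto the stack and jumps into the dynamic linker through the third entry (at 0x601010). For more info take a look at this presentation (slide 14).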

Which makes sense, since this address is known at link time

Is it? The final address comes from the shared library, which is loaded at a random address at startup due to ASLR, so there is no way for the static linker to know it...

Why is there a relocation entry for this GOT address?

When the PLT stub calls into the dynamic linker (on the first call), it passes the index of its .rela.plt entry (the `push 0x0` in the stub above). The dynamic linker looks up that entry in .rela.plt to find out how to relocate the GOT entry (i.e. which symbol to resolve and which slot to patch).