First, the toy program I'm playing with:
prog.c:
int func1();
int main(int argc, char const *argv[])
{
    func1();
    return 0;
}
lib.c:
int func1()
{
    return 0;
}
Build with:
gcc -O3 -g -shared -fpic ./lib.c -o liba.so
gcc prog.c -g -la -L. -o prog -Wl,-rpath=$PWD
And for completeness:
$ gcc --version
gcc (GCC) 6.3.1
$ ld --version
GNU ld version 2.26.1
Now, my question. I've confirmed what I've read about how lazy binding of dynamic symbols works, i.e. that initially the GOT entry for func1 points right back at the PLT, to the instruction following the jump:
$ gdb prog
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400666 <+0>: push rbp
0x0000000000400667 <+1>: mov rbp,rsp
0x000000000040066a <+4>: sub rsp,0x10
0x000000000040066e <+8>: mov DWORD PTR [rbp-0x4],edi
0x0000000000400671 <+11>: mov QWORD PTR [rbp-0x10],rsi
0x0000000000400675 <+15>: mov eax,0x0
0x000000000040067a <+20>: call 0x400560 <func1@plt> <<< call to shared lib via PLT
0x000000000040067f <+25>: mov eax,0x0
0x0000000000400684 <+30>: leave
0x0000000000400685 <+31>: ret
End of assembler dump.
(gdb) disassemble 0x400560
Dump of assembler code for function func1@plt:
0x0000000000400560 <+0>: jmp QWORD PTR [rip+0x200ab2] # 0x601018 <<< JMP to address stored in GOT
0x0000000000400566 <+6>: push 0x0 <<< ... which initially points right back here
0x000000000040056b <+11>: jmp 0x400550
End of assembler dump.
(gdb) x/g 0x601018
0x601018: 0x400566 <<< GOT points back to right after the just-executed jump
This is fine. Now, examining the .got.plt entry at 0x601018, which contains this pointer back to 0x400566, shows that the binary itself holds the address, without the dynamic linker doing anything at load time. This makes sense, since the address is known at link time:
$ readelf -x .got.plt ./prog
Hex dump of section '.got.plt':
NOTE: This section has relocations against it, but these have NOT been applied to this dump.
0x00601000 ???????? ???????? ???????? ???????? ..`.............
0x00601010 ???????? ???????? 66054000 ???????? ........f.@.....
(where 66054000 is the little-endian representation of the address 0x400566 stored in the GOT initially)
Fine, fine. But then...
$ readelf -r prog
<...>
Relocation section '.rela.plt' at offset 0x518 contains 1 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000601018 000100000007 R_X86_64_JUMP_SLO 0000000000000000 func1 + 0
Why is there a relocation entry for this GOT address? We've seen that the correct address is already there, in the binary, placed there at link time. I've also experimentally edited this relocation to make it ineffectual, and the program still runs fine. So the reloc seems to contribute nothing. What is it doing there, and more generally, in which scenarios do the relocations in .rela.plt actually matter?
Update #1:
To be clear, this is about PIC code and 64-bit code. Here are the relevant section addresses, to help clarify which sections the addresses above belong to:
$ readelf -S ./prog
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 9] .rela.dyn RELA 00000000004004e8 000004e8
0000000000000030 0000000000000018 A 5 0 8
[10] .rela.plt RELA 0000000000400518 00000518
0000000000000018 0000000000000018 AI 5 23 8
[12] .plt PROGBITS 0000000000400550 00000550
0000000000000020 0000000000000010 AX 0 0 16
[21] .dynamic DYNAMIC 0000000000600e00 00000e00
00000000000001f0 0000000000000010 WA 6 0 8
[22] .got PROGBITS 0000000000600ff0 00000ff0
0000000000000010 0000000000000008 WA 0 0 8
[23] .got.plt PROGBITS 0000000000601000 00001000
0000000000000020 0000000000000008 WA 0 0 8
Update #2:
Editing the section header for .rela.plt doesn't change the process image, so my edit did not actually disable the reloc.
I had also tried changing the reloc address (to another writable address), and that didn't seem to make a difference either. It turns out the address isn't used by the resolver during lazy binding, although the rest of the reloc is. The address itself is only used if lazy binding is turned off with LD_BIND_NOW.
Thanks @yugr