Is there any way to tell which glibc function is called from an ARM ELF binary? For example, consider the following disassembly:
8300 <printf@plt-0x40>:
....
8320: e28fc600 add ip, pc, #0, 12
8324: e28cca08 add ip, ip, #8, 20 ; 0x8000
8328: e5bcf344 ldr pc, [ip, #836]! ; 0x344
....
83fc <main>:
...
8424:ebffffbd bl 8320 <_init+0x2c>
Here, how can we tell that bl 8320 is a call to printf? Is this information stored somewhere in the ELF binary?
Is there any way we could say which glibc function is called from ARM elf binary?
Not really.
You could trivially ask "what external functions are called by a binary", like so:
nm -D a.out | grep ' U '
Which library the undefined functions are defined in is not recorded, and can in fact change. For example, you could use LD_PRELOAD=libfoo.so to inject a different printf defined in libfoo.so, and preempt the glibc definition of printf.
TL;DR: You have to compute the address of the GOT entry (the PLT stub computes it in the IP register) and find the relocation entry corresponding to this GOT entry. This relocation entry references the symbol name (through the dynamic symbol table and the dynamic string table).
Your example
This PLT entry computes the address of a PLTGOT entry in the IP register:
8320: e28fc600 add ip, pc, #0, 12
8324: e28cca08 add ip, ip, #8, 20 ; 0x8000
8328: e5bcf344 ldr pc, [ip, #836]! ; 0x344
This computes the address of the GOT entry: 0x8320 + 0x8 + 0x8000 + 0x344 = 0x1066c. The relocation table contains an entry that binds this GOT entry to a given symbol.
Another example
Let's take this PLT entry from my libc:
00015b98 :
15b98: e28fc601 add ip, pc, #1048576 ; 0x100000
15b9c: e28cca2f add ip, ip, #192512 ; 0x2f000
15ba0: e5bcf46c ldr pc, [ip, #1132]! ; 0x46c
The address of the GOT entry is: 0x15b98 + 0x8 + 0x100000 + 0x2f000 + 0x46c = 0x14500c.
If you want to know why "+ 0x8", this is because:
In ARM state, the value of the PC is the address of the current
instruction plus 8 bytes.
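The arithmetic above can be reproduced mechanically. The following is a minimal sketch, assuming the three-instruction PLT stub shape shown in this answer (two `add` instructions with rotated 8-bit immediates followed by a pre-indexed `ldr`); the helper names `ror32`, `arm_imm` and `plt_got_slot` are made up for illustration, and other PLT layouts (Thumb stubs, longer stubs) are not handled.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def arm_imm(insn):
    """Decode the immediate of an ARM data-processing instruction:
    an 8-bit constant rotated right by twice the 4-bit rotate field."""
    rotate = (insn >> 8) & 0xF
    imm8 = insn & 0xFF
    return ror32(imm8, 2 * rotate)


def plt_got_slot(plt_addr, insns):
    """Return the GOT slot address loaded by a stub of the form
    add ip, pc, #a ; add ip, ip, #b ; ldr pc, [ip, #c]!
    `plt_addr` is the address of the first instruction."""
    add1, add2, ldr = insns
    # In ARM state, reading PC yields the current instruction address + 8.
    ip = (plt_addr + 8 + arm_imm(add1)) & 0xFFFFFFFF
    ip = (ip + arm_imm(add2)) & 0xFFFFFFFF
    offset = ldr & 0xFFF          # 12-bit load offset
    if not (ldr >> 23) & 1:       # U bit clear means the offset is subtracted
        offset = -offset
    return (ip + offset) & 0xFFFFFFFF


# The stub at 0x8320 from the question.
print(hex(plt_got_slot(0x8320, (0xE28FC600, 0xE28CCA08, 0xE5BCF344))))
```

Feeding it the instruction words of the stub at 0x15b98 should give the second GOT address computed by hand above.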
Let's look at the relocation entry:
Offset Info Type Sym.Value Sym. Name
0014500c 0001e416 R_ARM_JUMP_SLOT 00077c28 realloc
So this PLT entry is the PLT stub for realloc, which is what we expected to get! \o/
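If you have many stubs to resolve, the lookup from GOT address to symbol name can be scripted. This is only a rough sketch: it assumes you have already captured the relocation listing as text in the column layout shown above (offset, info, type, symbol value, symbol name), and the function name `jump_slots` is hypothetical. Real tool output may differ in spacing or append version suffixes, so treat the parsing as illustrative.

```python
def jump_slots(reloc_listing):
    """Map GOT slot address -> symbol name for R_ARM_JUMP_SLOT lines.

    Expects lines shaped like:
    0014500c 0001e416 R_ARM_JUMP_SLOT 00077c28 realloc
    """
    slots = {}
    for line in reloc_listing.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[2] == "R_ARM_JUMP_SLOT":
            offset = int(fields[0], 16)
            # Drop a version suffix such as "@GLIBC_2.4" if one is present.
            name = fields[4].split("@")[0]
            slots[offset] = name
    return slots


listing = """\
Offset Info Type Sym.Value Sym. Name
0014500c 0001e416 R_ARM_JUMP_SLOT 00077c28 realloc
"""
print(jump_slots(listing).get(0x14500C))
```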
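Finding the symbol name
You might want to know how the symbol name is found. In my example, the info field is 0x0001e416. In a 32-bit ELF relocation, the upper 24 bits of this field hold the symbol index and the low 8 bits hold the relocation type, so this relocation uses symbol entry 0x1e4 = 484 in the dynamic symbol table (.dynsym):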
Num: Value Size Type Bind Vis Ndx Name
484: 00077c28 760 FUNC GLOBAL DEFAULT 11 realloc@@GLIBC_2.4
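The split of the info field can be written out explicitly. The sketch below mirrors the standard ELF32_R_SYM and ELF32_R_TYPE macros for 32-bit ELF; the Python function names are just illustrative spellings of those macros.

```python
R_ARM_JUMP_SLOT = 22  # relocation type number from the ARM ELF ABI


def elf32_r_sym(info):
    """Symbol table index: upper 24 bits of r_info."""
    return info >> 8


def elf32_r_type(info):
    """Relocation type: low 8 bits of r_info."""
    return info & 0xFF


info = 0x0001E416
print(elf32_r_sym(info), elf32_r_type(info) == R_ARM_JUMP_SLOT)
```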
In fact, the realloc string is not found in the symbol table directly but in the string table (.dynstr). The symbol table stores the offset of the string within the string table.
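To make the indirection concrete, here is a small sketch that reads the name offset out of a symbol entry and then reads the NUL-terminated string at that offset. It assumes the little-endian 32-bit `Elf32_Sym` layout (st_name, st_value, st_size, st_info, st_other, st_shndx); the `dynsym` and `dynstr` bytes are toy data built by hand, not taken from a real libc, and `symbol_name` is a hypothetical helper.

```python
import struct

# Elf32_Sym: st_name, st_value, st_size (32-bit each),
# st_info, st_other (8-bit each), st_shndx (16-bit); 16 bytes total.
ELF32_SYM = struct.Struct("<IIIBBH")


def symbol_name(dynsym, dynstr, index):
    """Return the name of symbol `index` by following st_name into dynstr."""
    st_name = ELF32_SYM.unpack_from(dynsym, index * ELF32_SYM.size)[0]
    end = dynstr.index(b"\0", st_name)  # names are NUL-terminated
    return dynstr[st_name:end].decode("ascii")


# Toy tables: index 0 is the mandatory null symbol; the malloc entry's
# value and size are invented, the realloc entry copies the values above.
dynstr = b"\0malloc\0realloc\0"
dynsym = (
    ELF32_SYM.pack(0, 0, 0, 0, 0, 0)
    + ELF32_SYM.pack(1, 0x77000, 100, 0x12, 0, 11)
    + ELF32_SYM.pack(8, 0x77C28, 760, 0x12, 0, 11)
)
print(symbol_name(dynsym, dynstr, 2))
```

On a real binary the two byte strings would come from the contents of the .dynsym and .dynstr sections, and the index from the relocation's info field.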