How to determine which glibc function is called fr

Posted 2019-09-12 11:10

Question:

Is there any way to tell which glibc function is called from an ARM ELF binary? For example, consider the following disassembly:

8300 <printf@plt-0x40>:
   ....
8320:   e28fc600    add ip, pc, #0, 12
8324:   e28cca08    add ip, ip, #8, 20  ; 0x8000
8328:   e5bcf344    ldr pc, [ip, #836]! ; 0x344

   ....
83fc <main>:
   ...
8424:ebffffbd   bl  8320 <_init+0x2c>

Here, how can we tell that bl 8320 is a call to printf? Is this information stored somewhere in the ELF binary?

Answer 1:

Is there any way we could say which glibc function is called from ARM elf binary?

Not really.

You could trivially ask "what external functions are called by a binary", like so:

nm -D a.out | grep ' U '

Which library the undefined functions are defined in is not recorded, and can in fact change. For example, you could use LD_PRELOAD=libfoo.so to inject a different printf defined in libfoo.so, and preempt the glibc definition of printf.



Answer 2:

TLDR: You have to compute the address of the GOT entry (stored in IP by the PLT) and find the relocation entry corresponding to this GOT entry. This relocation entry references the symbol name (through the dynamic symbol table and the dynamic string table).

Your example

This PLT entry computes the address of a PLTGOT entry in the IP register:

8320:   e28fc600    add ip, pc, #0, 12
8324:   e28cca08    add ip, ip, #8, 20  ; 0x8000
8328:   e5bcf344    ldr pc, [ip, #836]! ; 0x344

This computes the address of the GOT entry: 0x8320 + 0x8 + 0x8000 + 0x344 = 0x1066c. There is a relocation entry in the relocation table which binds this GOT entry to a given symbol.
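The same arithmetic can be done mechanically from the three instruction words. The following is a minimal Python sketch, assuming the stub has exactly the `add ip, pc` / `add ip, ip` / `ldr pc, [ip, #imm]!` shape shown above; the helper names (`ror32`, `arm_imm`, `plt_got_address`) are made up for illustration, and other PLT layouts (Thumb stubs, longer stubs) are not handled.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def arm_imm(insn):
    """Decode an ARM data-processing immediate: imm8 rotated right by 2*rot."""
    rot = (insn >> 8) & 0xF
    imm8 = insn & 0xFF
    return ror32(imm8, 2 * rot)

def plt_got_address(stub_addr, words):
    """Compute the GOT slot address loaded by a 3-instruction ARM PLT stub.

    Assumes: words = (add ip, pc, #imm ; add ip, ip, #imm ; ldr pc, [ip, #imm]!)
    """
    add1, add2, ldr = words
    # In ARM state, reading PC yields the instruction address plus 8.
    ip = (stub_addr + 8 + arm_imm(add1)) & 0xFFFFFFFF
    ip = (ip + arm_imm(add2)) & 0xFFFFFFFF
    offset = ldr & 0xFFF
    if not (ldr >> 23) & 1:  # U bit clear means the offset is subtracted
        offset = -offset
    return (ip + offset) & 0xFFFFFFFF

# Instruction words taken from the disassembly in the question.
print(hex(plt_got_address(0x8320, (0xE28FC600, 0xE28CCA08, 0xE5BCF344))))  # 0x1066c
```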

Another example

Let's take this PLT entry from my libc:

00015b98 :
   15b98:       e28fc601        add     ip, pc, #1048576        ; 0x100000
   15b9c:       e28cca2f        add     ip, ip, #192512 ; 0x2f000
   15ba0:       e5bcf46c        ldr     pc, [ip, #1132]!        ; 0x46c

The address of the GOT entry is: 0x15b98 + 0x8 + 0x100000 + 0x2f000 + 0x46c = 0x14500c.

If you want to know why "+ 0x8", this is because:

In ARM state, the value of the PC is the address of the current instruction plus 8 bytes.

Let's look at the relocation entry:

 Offset     Info    Type            Sym.Value  Sym. Name
0014500c  0001e416 R_ARM_JUMP_SLOT   00077c28   realloc

So this is the PLT entry for realloc, which is what we expected to get! \o/

Finding the symbol name
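Finding the relocation by hand amounts to scanning the PLT relocation table for an entry whose offset equals the GOT address. A rough Python sketch of that scan follows, assuming little-endian 32-bit ELF with 8-byte Elf32_Rel entries (r_offset, r_info). The table here is synthetic: the second entry uses the values from the output above, while the first entry is hypothetical filler, and `find_rel_by_offset` is an illustrative name.

```python
import struct

def find_rel_by_offset(rel_bytes, got_addr):
    """Return r_info of the Elf32_Rel entry whose r_offset equals got_addr.

    Assumes little-endian Elf32_Rel records: two 32-bit words per entry.
    Returns None when no entry matches.
    """
    for r_offset, r_info in struct.iter_unpack("<II", rel_bytes):
        if r_offset == got_addr:
            return r_info
    return None

# Synthetic relocation table (not read from a real file).
rel_plt = (
    struct.pack("<II", 0x00145008, 0x00000116)    # hypothetical neighbouring entry
    + struct.pack("<II", 0x0014500C, 0x0001E416)  # entry shown in the output above
)

print(hex(find_rel_by_offset(rel_plt, 0x14500C)))  # 0x1e416
```

In a real binary the bytes would come from the section that holds the PLT relocations rather than from `struct.pack`.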

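You might want to know how the symbol name is found. In my example, the info field is 0x0001e416. Its upper 24 bits hold the symbol index and its low byte holds the relocation type, so this relocation uses symbol entry 0x1e4 = 484 in the dynamic symbol table (.dynsym), and type 0x16 is R_ARM_JUMP_SLOT: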

   Num:    Value  Size Type    Bind   Vis      Ndx Name
   484: 00077c28   760 FUNC    GLOBAL DEFAULT   11 realloc@@GLIBC_2.4
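Splitting the info field is a shift and a mask, mirroring the ELF32_R_SYM and ELF32_R_TYPE macros for 32-bit ELF. A small Python sketch (the function name `split_r_info` is just for illustration):

```python
R_ARM_JUMP_SLOT = 22  # 0x16, relocation type used for PLT GOT slots on ARM

def split_r_info(r_info):
    """Split a 32-bit ELF r_info into (symbol index, relocation type)."""
    return r_info >> 8, r_info & 0xFF

sym_index, rel_type = split_r_info(0x0001E416)
print(sym_index, rel_type)  # 484 22
```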

In fact, the realloc string is not found in the symbol table directly but in the string table (.dynstr). The symbol table stores the offset of the string within the string table.
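To illustrate that last step, here is a hedged Python sketch that unpacks one 16-byte Elf32_Sym record and resolves its name through a string table. Everything in it is synthetic: the string table contents and the name offset are made up, only the value, size and section index are copied from the symbol line above, and little-endian layout is assumed.

```python
import struct

def read_cstring(strtab, offset):
    """Read the NUL-terminated string starting at `offset` in a string table."""
    end = strtab.index(b"\0", offset)
    return strtab[offset:end].decode("ascii")

def parse_sym(entry, strtab):
    """Decode a little-endian Elf32_Sym and look its name up in `strtab`."""
    st_name, st_value, st_size, st_info, st_other, st_shndx = struct.unpack(
        "<IIIBBH", entry
    )
    return {
        "name": read_cstring(strtab, st_name),
        "value": st_value,
        "size": st_size,
        "bind": st_info >> 4,   # 1 = GLOBAL
        "type": st_info & 0xF,  # 2 = FUNC
        "shndx": st_shndx,
    }

# Synthetic .dynstr: a leading NUL followed by a few names (contents are made up).
dynstr = b"\0malloc\0realloc\0free\0"
name_offset = dynstr.index(b"realloc")  # hypothetical st_name value

# Synthetic symbol record using the value/size/section index shown above.
sym = struct.pack("<IIIBBH", name_offset, 0x00077C28, 760, (1 << 4) | 2, 0, 11)

print(parse_sym(sym, dynstr)["name"])  # realloc
```

The symbol record itself only carries the integer `st_name`; the text "realloc" lives solely in the string table. The version suffix (@@GLIBC_2.4) printed by the tool comes from separate versioning information, which this sketch does not cover.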