Incorrect function size inside ARM ELF object

Posted 2019-03-03 22:19

Question:

readelf output of the object file:

Symbol table '.symtab' contains 15 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS fp16.c
     2: 00000000     0 SECTION LOCAL  DEFAULT    1 
     3: 00000000     0 SECTION LOCAL  DEFAULT    3 
     4: 00000000     0 SECTION LOCAL  DEFAULT    4 
     5: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $t
     6: 00000001   194 FUNC    LOCAL  DEFAULT    1 __gnu_f2h_internal
     7: 00000010     0 NOTYPE  LOCAL  DEFAULT    5 $d
     8: 00000000     0 SECTION LOCAL  DEFAULT    5 
     9: 00000000     0 SECTION LOCAL  DEFAULT    7 
    10: 000000c5    78 FUNC    GLOBAL HIDDEN     1 __gnu_h2f_internal
    11: 00000115     4 FUNC    GLOBAL HIDDEN     1 __gnu_f2h_ieee
    12: 00000119     4 FUNC    GLOBAL HIDDEN     1 __gnu_h2f_ieee
    13: 0000011d     4 FUNC    GLOBAL HIDDEN     1 __gnu_f2h_alternative
    14: 00000121     4 FUNC    GLOBAL HIDDEN     1 __gnu_h2f_alternative

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000124 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 00058c 000010 08      9   1  4
  [ 3] .data             PROGBITS        00000000 000158 000000 00  WA  0   0  1
  [ 4] .bss              NOBITS          00000000 000158 000000 00  WA  0   0  1
  [ 5] .debug_frame      PROGBITS        00000000 000158 00008c 00      0   0  4
  [ 6] .rel.debug_frame  REL             00000000 00059c 000060 08      9   5  4
  [ 7] .ARM.attributes   ARM_ATTRIBUTES  00000000 0001e4 00002f 00      0   0  1
  [ 8] .shstrtab         STRTAB          00000000 000213 000051 00      0   0  1
  [ 9] .symtab           SYMTAB          00000000 00041c 0000f0 10     10  10  4
  [10] .strtab           STRTAB          00000000 00050c 00007e 00      0   0  1

Relocation section '.rel.text' at offset 0x58c contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000011a  00000a66 R_ARM_THM_JUMP11  000000c5   __gnu_h2f_internal
00000122  00000a66 R_ARM_THM_JUMP11  000000c5   __gnu_h2f_internal

Relocation section '.rel.debug_frame' at offset 0x59c contains 12 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000014  00000802 R_ARM_ABS32       00000000   .debug_frame
00000018  00000202 R_ARM_ABS32       00000000   .text
00000040  00000802 R_ARM_ABS32       00000000   .debug_frame
00000044  00000202 R_ARM_ABS32       00000000   .text
00000050  00000802 R_ARM_ABS32       00000000   .debug_frame
00000054  00000202 R_ARM_ABS32       00000000   .text
00000060  00000802 R_ARM_ABS32       00000000   .debug_frame
00000064  00000202 R_ARM_ABS32       00000000   .text
00000070  00000802 R_ARM_ABS32       00000000   .debug_frame
00000074  00000202 R_ARM_ABS32       00000000   .text
00000080  00000802 R_ARM_ABS32       00000000   .debug_frame
00000084  00000202 R_ARM_ABS32       00000000   .text

The .text section structure as I understand it:

The .text section has a size of 0x124.

0x0: unknown byte
0x1-0xC3: __gnu_f2h_internal
0xC3-0xC5: two unknown bytes between those functions (btw what are those?)
0xC5-0x113: __gnu_h2f_internal
0x113-0x115: two unknown bytes between those functions
0x115-0x119: __gnu_f2h_ieee
0x119-0x11D: __gnu_h2f_ieee
0x11D-0x121: __gnu_f2h_alternative
0x121-0x125: __gnu_h2f_alternative // section is only 0x124, what happened to the missing byte?

Notice that the section size is 0x124 but the last function ends at 0x125. What happened to the missing byte?

Thanks.

Answer 1:

Technically, your "missing byte" is the one right there at 0x0.

Note that you're looking at the value of each symbol, i.e. the runtime function address (this would be a lot clearer if your .text section VMA wasn't 0). Since these are Thumb functions, the addresses have bit 0 set so that the processor switches to Thumb mode when calling them. The instructions themselves are still halfword-aligned, at 0x0, 0xc4, 0x114, etc.; they couldn't be executed otherwise (you'd take a fault for a misaligned PC). To get the actual VMA of the instruction a symbol refers to, strip off bit 0, as the ARM ELF spec describes for STT_FUNC symbols. Then subtract the start of the section, and you have the same relative offset as within the object file itself:

<offset in section> = (<symbol value> & ~1) - <section VMA>
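
To make the formula concrete, here is a minimal Python sketch that applies it to the FUNC symbols from the readelf output in the question. The values are copied by hand from the symbol table; the helper names `section_offset` and `layout` are only illustrative and not part of any tool.

```python
# Values copied by hand from the readelf output in the question.
SECTION_VMA = 0x0      # "Addr" column of .text
SECTION_SIZE = 0x124   # "Size" column of .text

# (name, symbol value, size in bytes) for each FUNC symbol
SYMBOLS = [
    ("__gnu_f2h_internal",    0x001, 194),
    ("__gnu_h2f_internal",    0x0C5, 78),
    ("__gnu_f2h_ieee",        0x115, 4),
    ("__gnu_h2f_ieee",        0x119, 4),
    ("__gnu_f2h_alternative", 0x11D, 4),
    ("__gnu_h2f_alternative", 0x121, 4),
]


def section_offset(value, section_vma=SECTION_VMA):
    """Clear the Thumb bit (bit 0), then make the address section-relative."""
    return (value & ~1) - section_vma


def layout(symbols):
    """Return (name, start, end) tuples; end is exclusive."""
    return [
        (name, section_offset(value), section_offset(value) + size)
        for name, value, size in symbols
    ]


if __name__ == "__main__":
    for name, start, end in layout(SYMBOLS):
        print(f"{start:#05x}-{end:#05x}  {name}")
```

With bit 0 cleared, the first function starts at 0x0 rather than 0x1, and the last one ends at 0x124, which is exactly the section size, so no byte is missing.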

The extra halfword of padding after some functions just ensures that each symbol is word-aligned. There are probably various reasons for this, but the first one that comes to mind is that the adr instruction wouldn't work properly if the functions weren't aligned.
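
As a rough check of the padding claim, the following Python sketch finds the gaps between consecutive functions and tests the alignment of each function start. It again uses values hand-copied from the symbol table in the question; `padding_gaps` is a hypothetical helper name.

```python
# (name, symbol value, size in bytes), copied from the question's symbol table.
SYMBOLS = [
    ("__gnu_f2h_internal",    0x001, 194),
    ("__gnu_h2f_internal",    0x0C5, 78),
    ("__gnu_f2h_ieee",        0x115, 4),
    ("__gnu_h2f_ieee",        0x119, 4),
    ("__gnu_f2h_alternative", 0x11D, 4),
    ("__gnu_h2f_alternative", 0x121, 4),
]


def spans(symbols):
    """Sorted (start, end) byte ranges with the Thumb bit cleared; end is exclusive."""
    return sorted(((value & ~1), (value & ~1) + size) for _, value, size in symbols)


def padding_gaps(symbols):
    """Return (gap_start, gap_end) for every hole between consecutive functions."""
    ranges = spans(symbols)
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:])
        if next_start > prev_end
    ]


if __name__ == "__main__":
    for gap_start, gap_end in padding_gaps(SYMBOLS):
        print(f"padding: {gap_start:#x}-{gap_end:#x} ({gap_end - gap_start} bytes)")
    print("all starts word-aligned:", all(start % 4 == 0 for start, _ in spans(SYMBOLS)))
```

The only gaps are the two halfwords at 0xc2-0xc4 and 0x112-0x114, and every function start is a multiple of 4, which is consistent with the padding being there for word alignment.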



Tags: arm elf readelf