GCC C++ Exception Handling Implementation

Posted 2019-03-15 21:34

Question:

I would like to know how GCC implements exception handling for C++ programs. I couldn't find an easy-to-understand and self-explanatory article on the Web (although there are many such articles for Visual C++). All I know is that GCC's implementation is called DWARF exception handling.

I have written a small C++ program and translated it into assembly with the command:

g++ main.cpp -S -masm=intel -fno-dwarf2-cfi-asm

The main.cpp and main.s files are given here. Could anyone please explain the contents of the main.s file, especially the sections .gcc_except_table and .eh_frame line-by-line? (My OS is Ubuntu 13.04 32-bit.) Thanks!

main.cpp:

void f()
{
    throw 1;
}

int main()
{
    int j;
    try {
        f();
    } catch (int i) {
        j = i;
    }   
    return 0;
}

main.s:

.file "main.cpp"
.intel_syntax noprefix
.text
.globl  _Z1fv
.type   _Z1fv, @function
_Z1fv:
.LFB0:
    push    ebp
.LCFI0:
    mov ebp, esp
.LCFI1:
    sub esp, 24
    mov DWORD PTR [esp], 4
    call    __cxa_allocate_exception
    mov DWORD PTR [eax], 1
    mov DWORD PTR [esp+8], 0
    mov DWORD PTR [esp+4], OFFSET FLAT:_ZTIi
    mov DWORD PTR [esp], eax
    call    __cxa_throw
.LFE0:
    .size   _Z1fv, .-_Z1fv
    .globl  main
    .type   main, @function
main:
.LFB1:
    push    ebp
.LCFI2:
    mov ebp, esp
.LCFI3:
    and esp, -16
    sub esp, 32
.LEHB0:
    call    _Z1fv
.LEHE0:
.L7:
    mov eax, 0
    jmp .L9
.L8:
    cmp edx, 1
    je  .L6
    mov DWORD PTR [esp], eax
.LEHB1:
    call    _Unwind_Resume
.LEHE1:
.L6:
    mov DWORD PTR [esp], eax
    call    __cxa_begin_catch
    mov eax, DWORD PTR [eax]
    mov DWORD PTR [esp+24], eax
    mov eax, DWORD PTR [esp+24]
    mov DWORD PTR [esp+28], eax
    call    __cxa_end_catch
    jmp .L7
.L9:
    leave
.LCFI4:
    ret
.LFE1:
    .globl  __gxx_personality_v0
    .section    .gcc_except_table,"a",@progbits
    .align 4
.LLSDA1:
    .byte   0xff
    .byte   0
    .uleb128 .LLSDATT1-.LLSDATTD1
.LLSDATTD1:
    .byte   0x1
    .uleb128 .LLSDACSE1-.LLSDACSB1
.LLSDACSB1:
    .uleb128 .LEHB0-.LFB1
    .uleb128 .LEHE0-.LEHB0
    .uleb128 .L8-.LFB1
    .uleb128 0x1
    .uleb128 .LEHB1-.LFB1
    .uleb128 .LEHE1-.LEHB1
    .uleb128 0
    .uleb128 0
.LLSDACSE1:
    .byte   0x1
    .byte   0
    .align 4
    .long   _ZTIi
.LLSDATT1:
    .text
    .size   main, .-main
    .section    .eh_frame,"a",@progbits
.Lframe1:
    .long   .LECIE1-.LSCIE1
.LSCIE1:
    .long   0
    .byte   0x1
    .string "zPL"
    .uleb128 0x1
    .sleb128 -4
    .byte   0x8
    .uleb128 0x6
    .byte   0
    .long   __gxx_personality_v0
    .byte   0
    .byte   0xc
    .uleb128 0x4
    .uleb128 0x4
    .byte   0x88
    .uleb128 0x1
    .align 4
.LECIE1:
.LSFDE1:
    .long   .LEFDE1-.LASFDE1
.LASFDE1:
    .long   .LASFDE1-.Lframe1
    .long   .LFB0
    .long   .LFE0-.LFB0
    .uleb128 0x4
    .long   0
    .byte   0x4
    .long   .LCFI0-.LFB0
    .byte   0xe
    .uleb128 0x8
    .byte   0x85
    .uleb128 0x2
    .byte   0x4
    .long   .LCFI1-.LCFI0
    .byte   0xd
    .uleb128 0x5
    .align 4
.LEFDE1:
.LSFDE3:
    .long   .LEFDE3-.LASFDE3
.LASFDE3:
    .long   .LASFDE3-.Lframe1
    .long   .LFB1
    .long   .LFE1-.LFB1
    .uleb128 0x4
    .long   .LLSDA1
    .byte   0x4
    .long   .LCFI2-.LFB1
    .byte   0xe
    .uleb128 0x8
    .byte   0x85
    .uleb128 0x2
    .byte   0x4
    .long   .LCFI3-.LCFI2
    .byte   0xd
    .uleb128 0x5
    .byte   0x4
    .long   .LCFI4-.LCFI3
    .byte   0xc5
    .byte   0xc
    .uleb128 0x4
    .uleb128 0x4
    .align 4
.LEFDE3:
    .ident  "GCC: (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3"
    .section    .note.GNU-stack,"",@progbits

Answer 1:

.eh_frame layout is described briefly in the LSB documentation. Ian Lance Taylor (author of the gold linker) also made some blog posts on .eh_frame and .gcc_except_table layout.

For a more reference-like description, check my Recon 2012 slides (start at slide 37 or so).

EDIT: here are the commented structures from your sample. First, the .eh_frame (some parts omitted for clarity):

.Lframe1:                     # start of CFI 1
    .long   .LECIE1-.LSCIE1   # length of CIE 1 data
.LSCIE1:                      # start of CIE 1 data
    .long   0                 # CIE id
    .byte   0x1               # Version
    .string "zPL"             # augmentation string:
                              # z: has augmentation data
                              # P: has personality routine pointer
                              # L: has LSDA pointer
    .uleb128 0x1              # code alignment factor
    .sleb128 -4               # data alignment factor
    .byte   0x8               # return address register no.
    .uleb128 0x6              # augmentation data length (z)
    .byte   0                 # personality routine pointer encoding (P): DW_EH_PE_ptr|DW_EH_PE_absptr
    .long   __gxx_personality_v0 # personality routine pointer (P)
    .byte   0                 # LSDA pointer encoding: DW_EH_PE_ptr|DW_EH_PE_absptr
    .byte   0xc               # Initial CFI Instructions
    [...]
    .align 4
.LECIE1:                      # end of CIE 1
    [...]

.LSFDE3:                      # start of FDE 3
    .long   .LEFDE3-.LASFDE3  # length of FDE 3
.LASFDE3:                     # start of FDE 3 data
    .long   .LASFDE3-.Lframe1 # Distance to parent CIE from here
    .long   .LFB1             # initial location                
    .long   .LFE1-.LFB1       # range length                    
    .uleb128 0x4              # Augmentation data length (z)    
    .long   .LLSDA1           # LSDA pointer (L)                
    .byte   0x4               # CFI instructions                
    .long   .LCFI2-.LFB1
    [...]
    .align 4
.LEFDE3:                      # end of FDE 3
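
The CFI instruction bytes elided as [...] above are ordinary DWARF call-frame opcodes with ULEB128 operands. As a rough illustration, the sketch below decodes the CIE's initial instructions from the question's listing (0xc, 0x4, 0x4, 0x88, 0x1). The names read_uleb128 and decode_cfi are made up for this example, only the opcodes that occur in the sample are handled, and register numbers follow the i386 DWARF numbering (4 = esp, 5 = ebp, 8 = return address).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode one unsigned LEB128 value at bytes[pos], advancing pos.
std::uint64_t read_uleb128(const std::vector<std::uint8_t>& bytes, std::size_t& pos) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    while (pos < bytes.size()) {
        const std::uint8_t b = bytes[pos++];
        result |= static_cast<std::uint64_t>(b & 0x7f) << shift;  // low 7 bits carry data
        if ((b & 0x80) == 0) break;                                // high bit clear: last byte
        shift += 7;
    }
    return result;
}

// Hypothetical helper: render a CFI byte stream as readable text.
// data_align is the CIE's data alignment factor (-4 in the sample).
std::vector<std::string> decode_cfi(const std::vector<std::uint8_t>& bytes, long data_align) {
    std::vector<std::string> out;
    std::size_t pos = 0;
    while (pos < bytes.size()) {
        const std::uint8_t op = bytes[pos++];
        const unsigned low = op & 0x3f;  // operand packed into the opcode byte
        if ((op & 0xc0) == 0x80) {       // DW_CFA_offset: register saved at CFA + n * data_align
            const long off = static_cast<long>(read_uleb128(bytes, pos)) * data_align;
            out.push_back("offset r" + std::to_string(low) + " at cfa" +
                          (off < 0 ? "" : "+") + std::to_string(off));
        } else if ((op & 0xc0) == 0xc0) {  // DW_CFA_restore
            out.push_back("restore r" + std::to_string(low));
        } else if (op == 0x0c) {           // DW_CFA_def_cfa: register, then offset
            const std::uint64_t reg = read_uleb128(bytes, pos);
            const std::uint64_t off = read_uleb128(bytes, pos);
            out.push_back("def_cfa r" + std::to_string(reg) + "+" + std::to_string(off));
        } else if (op == 0x0d) {           // DW_CFA_def_cfa_register
            out.push_back("def_cfa_register r" + std::to_string(read_uleb128(bytes, pos)));
        } else if (op == 0x0e) {           // DW_CFA_def_cfa_offset
            out.push_back("def_cfa_offset " + std::to_string(read_uleb128(bytes, pos)));
        } else if (op == 0x04) {           // DW_CFA_advance_loc4: skip the 4-byte delta
            pos += 4;
            out.push_back("advance_loc4");
        } else if (op == 0x00) {           // DW_CFA_nop, e.g. padding emitted by .align
            out.push_back("nop");
        } else {                           // not needed for the sample: stop decoding
            out.push_back("unknown");
            break;
        }
    }
    return out;
}
```

Calling decode_cfi({0x0c, 0x04, 0x04, 0x88, 0x01}, -4) yields "def_cfa r4+4" and "offset r8 at cfa-4": on function entry the CFA is esp+4, and the return address is stored at CFA-4.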

Next, the LSDA (language-specific data area) referenced by FDE 3:

.LLSDA1:                           # LSDA 1
    .byte   0xff                   # LPStart encoding: DW_EH_PE_omit
    .byte   0                      # TType encoding: DW_EH_PE_ptr|DW_EH_PE_absptr
    .uleb128 .LLSDATT1-.LLSDATTD1  # TType offset
.LLSDATTD1:                        # base for the TType offset above; call site table header follows
    .byte   0x1                    # call site encoding: DW_EH_PE_uleb128|DW_EH_PE_absptr
    .uleb128 .LLSDACSE1-.LLSDACSB1 # call site table length
.LLSDACSB1:                        # LSDA 1 call site entries
    .uleb128 .LEHB0-.LFB1          # call site 0 start
    .uleb128 .LEHE0-.LEHB0         # call site 0 length
    .uleb128 .L8-.LFB1             # call site 0 landing pad
    .uleb128 0x1                   # call site 0 action (1=action 1)
    .uleb128 .LEHB1-.LFB1          # call site 1 start
    .uleb128 .LEHE1-.LEHB1         # call site 1 length
    .uleb128 0                     # call site 1 landing pad
    .uleb128 0                     # call site 1 action (0=no action)
.LLSDACSE1:                        # LSDA 1 action table entries
    .byte   0x1                    # action 1 filter (1=T1 typeinfo)
    .byte   0                      # displacement to next action (0=end of chain)
    .align 4
    .long   _ZTIi                  # T1 typeinfo ("typeinfo for int")
.LLSDATT1:                         # LSDA 1 TTBase
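
To make the roles of these tables concrete, here is a simplified model of the lookup that a personality routine such as __gxx_personality_v0 performs for one frame. This is only a sketch under several assumptions: the struct and function names are invented for the example, the tables are held as already-decoded vectors rather than raw bytes, the numeric offsets stand in for label differences whose real values the listing does not give, and type matching is reduced to comparing std::type_info objects (the real routine also deals with base classes, pointers, cleanups and exception specifications).

```cpp
#include <cassert>
#include <cstddef>
#include <typeinfo>
#include <vector>

struct CallSite {          // one decoded call site record
    unsigned start;        // region start, relative to the function start
    unsigned length;       // region length in bytes
    unsigned landing_pad;  // 0 = no landing pad
    unsigned action;       // 0 = no action, otherwise 1-based position in the action table
};

struct Action {
    int filter;  // > 0: index into the type table (1 = T1)
    int next;    // 0 = end of chain; simplified here to a record step, not a byte displacement
};

struct Lsda {
    std::vector<CallSite> call_sites;
    std::vector<Action> actions;
    std::vector<const std::type_info*> types;  // types[0] is T1 (simplified forward layout)
};

struct Handler { bool found; unsigned landing_pad; int filter; };

// Find a catch handler for an exception of type `thrown` raised at offset `ip`.
Handler find_handler(const Lsda& lsda, unsigned ip, const std::type_info& thrown) {
    for (const CallSite& cs : lsda.call_sites) {
        if (ip < cs.start || ip >= cs.start + cs.length) continue;
        // No landing pad, or a landing pad with no action (cleanup only): no catch here.
        if (cs.landing_pad == 0 || cs.action == 0) return {false, 0, 0};
        std::size_t idx = cs.action - 1;
        while (idx < lsda.actions.size()) {
            const Action& a = lsda.actions[idx];
            if (a.filter > 0 && static_cast<std::size_t>(a.filter) <= lsda.types.size() &&
                *lsda.types[a.filter - 1] == thrown) {
                return {true, cs.landing_pad, a.filter};
            }
            if (a.next == 0) break;  // end of the action chain
            idx += a.next;
        }
        return {false, 0, 0};
    }
    return {false, 0, 0};  // ip not covered by any call site
}

// Model of LSDA 1 from the sample; the offsets are illustrative assumptions.
Lsda make_sample_lsda() {
    Lsda lsda;
    lsda.call_sites = {{9, 5, 21, 1},    // call _Z1fv: landing pad .L8, action 1
                       {29, 5, 0, 0}};   // call _Unwind_Resume: no landing pad
    lsda.actions = {{1, 0}};             // action 1: filter 1 (T1), end of chain
    lsda.types = {&typeid(int)};         // T1 = typeinfo for int
    return lsda;
}
```

With this model, find_handler(make_sample_lsda(), 13, typeid(int)) reports the landing pad with filter 1. That filter value is what the landing pad at .L8 tests with cmp edx, 1, while eax carries the exception object pointer that is handed to __cxa_begin_catch.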


Answer 2:

The Itanium ABI (which gcc, clang and a number of other compilers follow) specifies that exception handling should follow the Zero-Cost strategy.

The idea of the Zero-Cost strategy is to push all exception handling into side tables that are kept off the main program execution path (and thus do not pollute the instruction cache). These tables are indexed by program point, i.e. by the address of the instruction being executed.

Furthermore, DWARF information (which is really debug information) is used to unwind the stack. This functionality is usually provided by a library such as libunwind, whose source code is chock-full of assembly (and thus very platform-specific).
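
As a small illustration of that unwinding machinery, GCC and Clang ship an <unwind.h> header that exposes the Itanium ABI unwinder interface, including _Unwind_Backtrace, which walks the stack using the unwind tables. The sketch below merely counts the frames it visits; count_frame and count_stack_frames are names invented here, and the exact count depends on platform, optimization level and inlining.

```cpp
#include <cassert>
#include <unwind.h>

// Callback invoked by the unwinder once per stack frame.
static _Unwind_Reason_Code count_frame(struct _Unwind_Context* context, void* arg) {
    int* count = static_cast<int*>(arg);
    if (_Unwind_GetIP(context) != 0) {  // instruction pointer recovered for this frame
        ++*count;
    }
    return _URC_NO_REASON;              // keep walking towards the outermost frame
}

// Hypothetical helper: number of frames the unwinder can see from here.
int count_stack_frames() {
    int count = 0;
    _Unwind_Backtrace(count_frame, &count);
    return count;
}
```

Throwing an exception drives the same frame-by-frame walk, except that the unwinder calls each frame's personality routine instead of a user callback.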

Advantages:

  • 0-cost for entering try/catch block (as fast as if there was none)
  • 0-cost for having a throw statement in a function (as long as it is not taken)

Disadvantage:

  • Slow when an exception is actually thrown (roughly 10x slower than an if-based strategy), because the side tables are usually not in cache and expensive computations are then needed to determine which catch clause actually matches (based on RTTI)
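
The trade-off above can be explored with a small experiment. The following is a rough sketch, not a rigorous benchmark: all function names are invented for the example, every call is made to fail on purpose, and the measured ratio will vary widely with compiler, optimization flags and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>

// "if strategy": report failure through the return value.
bool divide_code(int a, int b, int& out) {
    if (b == 0) return false;
    out = a / b;
    return true;
}

// Exception strategy: report failure by throwing.
int divide_throw(int a, int b) {
    if (b == 0) throw std::domain_error("division by zero");
    return a / b;
}

// Count failures over n calls that all fail, using error codes.
int failures_with_codes(int n) {
    int failures = 0;
    int out = 0;
    for (int i = 0; i < n; ++i) {
        if (!divide_code(i, 0, out)) ++failures;
    }
    return failures;
}

// Same workload, but each failure unwinds through the side tables.
int failures_with_exceptions(int n) {
    int failures = 0;
    for (int i = 0; i < n; ++i) {
        try {
            divide_throw(i, 0);
        } catch (const std::domain_error&) {
            ++failures;
        }
    }
    return failures;
}

// Wall-clock time of one call to work(), in milliseconds.
template <typename F>
double elapsed_ms(F&& work) {
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Comparing elapsed_ms([] { failures_with_codes(100000); }) with elapsed_ms([] { failures_with_exceptions(100000); }) gives a feel for the cost of the throwing path; an optimizing compiler may remove most of the error-code loop, so treat any ratio as indicative only.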

It is a very popular strategy, implemented on both 32-bit and 64-bit platforms by all major compilers... except 32-bit MSVC (if I remember correctly).