AArch64 relocation prefixes

Posted 2020-02-28 10:32

Question:

I noticed a GNU asm relocation syntax for ARM 64-bit assembly. What are those pieces like #:abs_g0_nc: and :pg_hi21:? Where are they explained? Is there a pattern to them or are they made up on the go? Where can I learn more?

Answer 1:

Introduction

ELF64 defines two types of relocation entries, called REL and RELA:

typedef struct
{
    Elf64_Addr r_offset;    /* Address of reference */
    Elf64_Xword r_info;     /* Symbol index and type of relocation */
} Elf64_Rel;

typedef struct
{
    Elf64_Addr r_offset;    /* Address of reference */
    Elf64_Xword r_info;     /* Symbol index and type of relocation */
    Elf64_Sxword r_addend;  /* Constant part of expression */
} Elf64_Rela;

The purpose of each relocation entry is to give the loader (static or dynamic) four pieces of information:

  • The virtual address or the offset of the instruction to patch.
    This is given by r_offset.

  • The symbol accessed, from which its runtime address is obtained.
    The symbol's index in the symbol table is given by the upper 32 bits of r_info.

  • A custom value called the addend.
    This value is used as an operand in the expression that computes the value written to patch the instruction.
    RELA entries hold this value in r_addend; REL entries extract it from the relocation site.

  • The relocation type.
    This determines which expression is used to compute the value that patches the instruction.
    It is encoded in the lower 32 bits of r_info.
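As a rough illustration of the list above, the sketch below splits r_info into its two halves. The macro and function names are hypothetical local stand-ins (system headers usually offer ELF64_R_SYM and ELF64_R_TYPE for the same purpose), so the snippet does not depend on any ELF header.

```c
#include <stdint.h>

/* Hypothetical local helpers; not taken from any system header. */
#define RELOC_SYM(info)  ((uint32_t)((uint64_t)(info) >> 32))          /* upper 32 bits: symbol index   */
#define RELOC_TYPE(info) ((uint32_t)((uint64_t)(info) & 0xffffffffu))  /* lower 32 bits: relocation type */

/* Build an r_info value from a symbol index and a relocation type. */
static inline uint64_t reloc_info(uint32_t sym, uint32_t type)
{
    return ((uint64_t)sym << 32) | type;
}
```

A loader would use the symbol index to look up the symbol address and the type to select the formula to apply.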

Relocating

During the relocation phase the loader goes through all the relocation entries and writes to the location specified by each r_offset. The value to store is computed from the addend (r_addend for RELA) and the symbol address (obtained through the upper part of r_info), using a formula selected by the lower part of r_info.

Actually, the write step has been simplified here. Contrary to other architectures, where the immediate field of an instruction usually occupies bytes entirely separate from the ones that encode the operation, in ARM the immediate value is mixed with other encoding information.
So the loader must know what kind of instruction it is relocating, if it is an instruction at all [1]. Instead of letting the loader disassemble the relocation site, the assembler sets the relocation type according to the instruction.

Each relocation type can relocate only one or two encoding-equivalent instructions.
In specific cases the relocation even changes the type of the instruction.

The value computed during the relocation is implicitly extended to 64 bits, signed or unsigned depending on the relocation type chosen.

AArch64 relocation

Since ARM is a RISC architecture with a fixed instruction size, loading a full-width (64-bit) immediate into a register is non-trivial: no instruction can have a full-width immediate field.

Relocation in AArch64 has to address this issue too. It is actually a twofold problem: first, find the real value that the programmer intended to use (this is the pure relocation part of the problem); second, find a way to put it into a register, since no instruction has a 64-bit immediate field.

The second issue is addressed by group relocations. Each relocation type in a group computes one 16-bit part of the 64-bit value, so a group can contain at most four relocation types (ranging from G0 to G3).

This slicing into 16-bit parts matches the immediate field of movk (move keeping), movz (move zeroing) and movn (move negating logically).
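The slicing can be sketched in C as follows. This is only an illustration: the function names are hypothetical, and it assumes the 16-bit immediate of movz/movk/movn sits in bits [20:5] of the instruction word.

```c
#include <stdint.h>

/* Return group g (0..3) of a 64-bit value, i.e. X[16*g+15 : 16*g]. */
static inline uint16_t group16(uint64_t x, unsigned g)
{
    return (uint16_t)(x >> (16u * g));
}

/* Write imm16 into bits [20:5] of a movz/movk/movn instruction word,
   leaving the opcode, the shift selector and the destination register untouched. */
static inline uint32_t patch_movw_imm16(uint32_t insn, uint16_t imm16)
{
    return (insn & ~(0xffffu << 5)) | ((uint32_t)imm16 << 5);
}
```

For instance, patching four instruction words with group16(x, 3) down to group16(x, 0) corresponds to what a G3..G0 group of relocations does for one 64-bit value.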
Other instructions, like b, bl, adrp, adr and so on, have a relocation type specially suited for them.

Whenever there is only one possible, and thus unambiguous, relocation type for a given instruction that references a symbol, the assembler can generate the corresponding entry without the programmer having to specify it explicitly.

Group relocations don't fit into this category: they exist to give the programmer some flexibility and are therefore generally stated explicitly. Within a group, a relocation type specifies whether an overflow check must be performed or not.
A G0 relocation, used to load the lower 16 bits of a value, checks that the value fits in 16 bits (signed or unsigned, depending on the specific type used) unless the check is explicitly suppressed. The same is true for G1, which loads bits 31-16 and checks that the value fits in 32 bits.
As a consequence, G3 is always non-checking, as every value fits in 64 bits.
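A minimal sketch of these overflow checks, with hypothetical function names. The signed bound follows the convention used in the tables further down (-2^n ≤ X < 2^n); the unsigned bound is assumed to be 0 ≤ X < 2^n.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned check for group g: 0 <= X < 2^(16*(g+1)). Always true for G3. */
static inline bool fits_unsigned_group(uint64_t x, unsigned g)
{
    unsigned nbits = 16u * (g + 1u);
    return nbits >= 64u || (x >> nbits) == 0;
}

/* Signed check for group g (0..2): -2^(16*(g+1)) <= X < 2^(16*(g+1)). */
static inline bool fits_signed_group(int64_t x, unsigned g)
{
    unsigned nbits = 16u * (g + 1u);
    if (nbits >= 63u)
        return true;                      /* every int64_t value fits */
    int64_t limit = (int64_t)1 << nbits;
    return x >= -limit && x < limit;
}
```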

Finally, relocations can be used to load integer values into registers. In fact, the address of a symbol is nothing more than an arbitrary integer constant.
Note that r_addend is 64 bits wide.


[1] If r_offset points to a site in a data section, the computed value is written as a 64-bit word at the indicated location.

Relocation operators

First of all, some references:

  • The ARM document that describes the relocation types for the ELF64 format is here, section 4.6

  • A test AArch64 assembly file that, presumably, contains all the relocation operators available to GAS is here

Conventions

Following the ARM document convention we have:

S is the runtime address of the symbol being relocated.
A is the addend for the relocation.
P is the address of the relocation site (derived from r_offset).
X is the result of a relocation operation, before any masking or bit-selection operation is applied.
Page(expr) is the page address of the expression expr, defined as expr & ~0xFFF, i.e. expr with the lower 12 bits cleared.
GOT is the address of the Global Offset Table.
GDAT(S+A) represents a 64-bit entry in the GOT for address S+A. The entry will be relocated at run time with relocation R_AARCH64_GLOB_DAT(S+A).
G(expr) is the address of the GOT entry for the expression expr.
Delta(S) resolves to the difference between the static link address of S and the execution address of S. If S is the null symbol (ELF symbol index 0), resolves to the difference between the static link address of P and the execution address of P.
Indirect(expr) represents the result of calling expr as a function.
[msb:lsb] is a bit-mask operation representing the selection of bits in a value, bounds are inclusive.
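Two of these conventions translate directly into C. The sketch below uses hypothetical function names and made-up addresses; page_delta computes Page(S+A) - Page(P), the expression an adrp relocation works with.

```c
#include <stdint.h>

/* Page(expr): expr with the lower 12 bits cleared. */
static inline uint64_t page(uint64_t expr)
{
    return expr & ~(uint64_t)0xFFF;
}

/* X[msb:lsb], bounds inclusive, assuming 0 <= lsb <= msb <= 63. */
static inline uint64_t bits(uint64_t x, unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1u;
    uint64_t mask = (width >= 64u) ? ~(uint64_t)0 : (((uint64_t)1 << width) - 1u);
    return (x >> lsb) & mask;
}

/* Page(S+A) - Page(P) as a signed displacement. */
static inline int64_t page_delta(uint64_t s, int64_t a, uint64_t p)
{
    return (int64_t)(page(s + (uint64_t)a) - page(p));
}
```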

Operators

Relocation names are shown without the R_AARCH64_ prefix for the sake of compactness.

Expressions of the kind |X|≤2^16 stand for -2^16 ≤ X < 2^16; note the strict inequality on the right.
This is an abuse of notation, forced by the constraints of formatting a table.

Group relocations

Operator    | Relocation name | Operation | Inst | Immediate | Check
------------+-----------------+-----------+------+-----------+----------
:abs_g0:    | MOVW_UABS_G0    | S + A     | movz | X[15:0]   | 0≤X<2^16
------------+-----------------+-----------+------+-----------+----------
:abs_g0_nc: | MOVW_UABS_G0_NC | S + A     | movk | X[15:0]   | 
------------+-----------------+-----------+------+-----------+----------
:abs_g1:    | MOVW_UABS_G1    | S + A     | movz | X[31:16]  | 0≤X<2^32
------------+-----------------+-----------+------+-----------+----------
:abs_g1_nc: | MOVW_UABS_G1_NC | S + A     | movk | X[31:16]  | 
------------+-----------------+-----------+------+-----------+----------
:abs_g2:    | MOVW_UABS_G2    | S + A     | movz | X[47:32]  | 0≤X<2^48
------------+-----------------+-----------+------+-----------+----------
:abs_g2_nc: | MOVW_UABS_G2_NC | S + A     | movk | X[47:32]  | 
------------+-----------------+-----------+------+-----------+----------
:abs_g3:    | MOVW_UABS_G3    | S + A     | movk | X[63:48]  | 
            |                 |           | movz |           |
------------+-----------------+-----------+------+-----------+----------
:abs_g0_s:  | MOVW_SABS_G0    | S + A     | movz | X[15:0]   | |X|≤2^16
            |                 |           | movn |           |
------------+-----------------+-----------+------+-----------+----------
:abs_g1_s:  | MOVW_SABS_G1    | S + A     | movz | X[31:16]  | |X|≤2^32
            |                 |           | movn |           |
------------+-----------------+-----------+------+-----------+----------
:abs_g2_s:  | MOVW_SABS_G2    | S + A     | movz | X[47:32]  | |X|≤2^48
            |                 |           | movn |           |
------------+-----------------+-----------+------+-----------+----------

The table shows the ABS versions; the assembler can pick the PREL (PC-relative) or the GOTOFF (GOT-relative) version depending on the symbol referenced and the type of output format.

A typical use of these relocation operators is:

Unsigned 64 bits                      Signed 48 bits
movz    x1,#:abs_g3:u64               movz    x1,#:abs_g2_s:s48
movk    x1,#:abs_g2_nc:u64            movk    x1,#:abs_g1_nc:s48
movk    x1,#:abs_g1_nc:u64            movk    x1,#:abs_g0_nc:s48
movk    x1,#:abs_g0_nc:u64

Usually only one checking operator is used, the one that sets the highest part.
That's why the checking versions relocate movz only, while the non-checking versions relocate movk (which sets only part of a register).
G3 relocates both because it is intrinsically non-checking, as no value can exceed 64 bits.

The signed versions end with _s and they are always checking.
There is no G3 version because when a 64-bit value is used the sign is fully specified in the value itself.
They are only ever used to set the highest part, as the sign is relevant only there.
They are always checking because an overflow in a signed value makes the value meaningless.
These relocations change the type of the instruction to movn or movz based on the sign of the value; this effectively sign-extends the value.
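The movz/movn selection can be sketched as follows. The type and function names are hypothetical, and the rule for negative values (movn with the complemented bits) is stated here as an assumption about how the sign extension is obtained, not as a quote from the specification.

```c
#include <stdint.h>

enum movw_kind { MOVW_MOVZ, MOVW_MOVN };

struct movw_signed {
    enum movw_kind kind;  /* instruction the relocation site is turned into */
    uint16_t imm16;       /* value for the 16-bit immediate field */
};

/* Choose movz or movn for a signed value x and group g (0..2). */
static inline struct movw_signed relocate_signed_group(int64_t x, unsigned g)
{
    struct movw_signed r;
    if (x >= 0) {
        r.kind = MOVW_MOVZ;   /* register = imm16 << (16*g), upper bits zero */
        r.imm16 = (uint16_t)((uint64_t)x >> (16u * g));
    } else {
        r.kind = MOVW_MOVN;   /* register = ~(imm16 << (16*g)), upper bits one */
        r.imm16 = (uint16_t)(~(uint64_t)x >> (16u * g));
    }
    return r;
}
```

With x = -2 and g = 0 this yields movn with immediate 1, and ~1 is indeed -2 once extended to 64 bits.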

Group relocations are also available in PC-relative (PREL) and GOT-relative (GOTOFF) variants, as noted above.

PC-relative, 19, 21, 33 bits addresses

Operator    | Relocation name | Operation | Inst | Immediate | Check
------------+-----------------+-----------+------+-----------+----------
[implicit]  | LD_PREL_LO19    | S + A - P | ldr  | X[20:2]   | |X|≤2^20
------------+-----------------+-----------+------+-----------+----------
[implicit]  | ADR_PREL_LO21   | S + A - P | adr  | X[20:0]   | |X|≤2^20
------------+-----------------+-----------+------+-----------+----------
:pg_hi21:   | ADR_PREL_PG     | Page(S+A) | adrp | X[32:12]  | |X|≤2^32
            | _HI21           | - Page(P) |      |           |
------------+-----------------+-----------+------+-----------+----------
:pg_hi21_nc:| ADR_PREL_PG     | Page(S+A) | adrp | X[32:12]  | 
            | _HI21_NC        | - Page(P) |      |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | ADD_ABS_LO12_NC | S + A     | add  | X[11:0]   | 
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST8_ABS_LO12  | S + A     | ld   | X[11:0]   | 
            | _NC             |           | st   |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST16_ABS_LO12 | S + A     | ld   | X[11:1]   | 
            | _NC             |           | st   |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST32_ABS_LO12 | S + A     | ld   | X[11:2]   | 
            | _NC             |           | st   |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST64_ABS_LO12 | S + A     | ld   | X[11:3]   | 
            | _NC             |           | st   |           |
            |                 |           | prfm |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST128_ABS     | S + A     | ld   | X[11:4]   | 
            | _LO12_NC        |           | st   |           |
------------+-----------------+-----------+------+-----------+----------

The :lo12: operator changes meaning depending on the size of the data the instruction handles (e.g. ldrb uses LDST8_ABS_LO12_NC, ldrh uses LDST16_ABS_LO12_NC).
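A sketch of how the immediate differs with the access size, following the X[11:n] column of the table. The function names are hypothetical and the addresses in any usage are made up.

```c
#include <stdint.h>

/* Immediate for add: X[11:0]. */
static inline uint32_t lo12_add(uint64_t x)
{
    return (uint32_t)(x & 0xFFFu);
}

/* Immediate for a load/store with an access size of 1, 2, 4, 8 or 16 bytes:
   X[11:log2(size)], i.e. the low 12 bits scaled down by the access size. */
static inline uint32_t lo12_ldst(uint64_t x, unsigned size_bytes)
{
    unsigned shift = 0;
    while ((1u << shift) < size_bytes)
        shift++;
    return (uint32_t)((x & 0xFFFu) >> shift);
}
```

For an 8-byte access at address 0x412348 the immediate would be 0x348 >> 3 = 0x69, while an add for the same address gets 0x348.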

GOT-relative versions of these relocations also exist; the assembler will pick the right one.

Control flow relocations

Operator    | Relocation name | Operation | Inst | Immediate | Check
------------+-----------------+-----------+------+-----------+----------
[implicit]  | TSTBR14         | S + A - P | tbz  | X[15:2]   | |X|≤2^15
            |                 |           | tbnz |           |  
------------+-----------------+-----------+------+-----------+----------
[implicit]  | CONDBR19        | S + A - P | b.*  | X[20:2]   | |X|≤2^20
------------+-----------------+-----------+------+-----------+----------
[implicit]  | JUMP26          | S + A - P | b    | X[27:2]   | |X|≤2^27
------------+-----------------+-----------+------+-----------+----------
[implicit]  | CALL26          | S + A - P | bl   | X[27:2]   | |X|≤2^27
------------+-----------------+-----------+------+-----------+----------
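To make the last two rows concrete, here is a minimal sketch of applying a JUMP26/CALL26-style relocation. The function name is hypothetical, and the snippet assumes the 26-bit immediate occupies the low 26 bits of the b/bl instruction word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Patch a b/bl instruction word with X[27:2], where X = S + A - P.
   Returns false when X is out of range or not 4-byte aligned. */
static inline bool patch_branch26(uint32_t *insn, uint64_t s, int64_t a, uint64_t p)
{
    int64_t x = (int64_t)(s + (uint64_t)a - p);
    if (x < -((int64_t)1 << 27) || x >= ((int64_t)1 << 27))
        return false;                         /* check: -2^27 <= X < 2^27 */
    if (x & 3)
        return false;                         /* branch targets are 4-byte aligned */
    uint32_t imm26 = (uint32_t)(((uint64_t)x >> 2) & 0x03FFFFFFu);
    *insn = (*insn & 0xFC000000u) | imm26;    /* keep the opcode bits */
    return true;
}
```

The same pattern, with a different field position and width, would apply to TSTBR14 and CONDBR19.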

Epilogue

I couldn't find any official documentation.
The tables above have been reconstructed from the GAS test case and the ARM document explaining the type of relocations available for AArch64 compliant ELFs.

The tables don't show all the relocations present in the ARM document, as most of them are complementary versions that the assembler picks automatically.

A section with examples would be great, but I don't have an ARM GAS.
In the future I may extend this answer to include examples of assembly listings and relocations dumps.