Consider a simple piece of C code that sets a register:
int main()
{
int *a = (int*)111111;
*a = 0x1000;
return 0;
}
When I compile this code for ARM (arm-none-eabi-gcc) with level 1 optimization, the assembly code is something like:
mov r2, #4096
mov r3, #110592
str r2, [r3, #519]
mov r0, #0
bx lr
It looks like the address 111111 was rounded down to the nearest 4 KB boundary (110592) and moved into r3, and the value 4096 (0x1000) was then stored at offset 519 from that base (110592 + 519 = 111111). Why does this happen?
In x86, the assembly is straightforward:
movl $4096, 111111
movl $0, %eax
ret
The x86 encoding is possible because x86 has variable-sized instructions, from 1 byte up to 15 bytes including prefixes.
An ARM instruction is 32 bits wide (not counting the Thumb modes), which means that it's simply not possible to encode every 32-bit constant (immediate) in a single instruction.
Architectures with fixed-size instructions typically use one or more of the following methods to load large constants:
1) movi #r1, Imm8 ; // Here Imm8 or ImmX is simply X least significant bits
2) movhi #r1, Imm16 ; // Here Imm16 loads the 16 MSB of the register
3) load #r1, (PC + ImmX); // use PC-relative address to put constant in code
4) movn #r1, Imm8 ; // load the inverse of Imm8 (for signed constants)
5) mov(i/n) #r1, Imm8 << N; // where N=0,8,16,24
Architectures with variable-size instructions, on the other hand, can put all the constants in a single instruction:
xx xx xx 00 10 00 00 07 b2 01 00 ; // assuming that it takes 3 bytes to encode
; // the instruction and the addressing mode,
; // plus 4 bytes to encode 4096 (0x00001000) and 4 bytes to encode the address
; // 111111 (0x0001B207), both little-endian; the layout shown is schematic
The address had to be split into two parts because this specific constant cannot be loaded into a register with a single instruction.
The ARM documentation specifies limitations for the immediate constants allowed in some instructions (such as MOV):
In ARM instructions, constant can have any value that can be produced
by rotating an 8-bit value right by any even number of bits within a
32-bit word.
In 32-bit Thumb-2 instructions, constant can be:
Any constant that can be produced by shifting an 8-bit value left by
any number of bits within a 32-bit word.
Any constant of the form 0x00XY00XY.
Any constant of the form 0xXY00XY00.
Any constant of the form 0xXYXYXYXY.
The value 111111 (0x1B207 in hex) can't be represented as any of the above, so the compiler had to split it. 110592 is 0x1B000, so it fulfills the first condition (the 8-bit value 0x1B rotated left by 12 bits, which is the same as rotated right by 20 bits) and can be loaded using a single MOV instruction.
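The ARM-mode rule quoted above can be checked mechanically. The following is a minimal sketch in C, not code taken from GCC or the ARM documentation; the function names `rotl32` and `arm_imm_encodable` are made up for illustration. It tries every even rotation and reports whether the value collapses to 8 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 <= n < 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    assert(n < 32);
    return n == 0 ? v : (v << n) | (v >> (32u - n));
}

/* Hypothetical helper: true if v is an 8-bit value rotated right by an
   even number of bits, i.e. the ARM-mode rule quoted above. */
bool arm_imm_encodable(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Rotating left by rot undoes a rotate-right by rot. */
        if (rotl32(v, rot) <= 0xFFu)
            return true;
    }
    return false;
}
```

With this sketch, `arm_imm_encodable(110592)` and `arm_imm_encodable(4096)` are true, while `arm_imm_encodable(111111)` is false, which matches the split the compiler performed.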
The STR instruction, on the other hand, has a different set of limitations for the offsets it can use. In particular, 519 (0x207) falls within the -4095 to 4095 range allowed for word loads and stores in ARM mode.
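The base-plus-offset split from the question can be reproduced with simple masking. This is only a sketch of the arithmetic, not GCC's actual selection logic (GCC may also pick a negative offset); the type `struct split_addr` and the function `split_address` are hypothetical names. Whether the resulting base is a valid MOV immediate still has to be checked separately.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a base for MOV plus an offset for STR. */
struct split_addr {
    uint32_t base;   /* multiple of 4096, candidate MOV immediate */
    uint32_t offset; /* 0..4095, within the STR offset range */
};

/* Split addr into a 4 KB-aligned base and the remaining low 12 bits. */
struct split_addr split_address(uint32_t addr)
{
    struct split_addr s;
    s.base = addr & ~UINT32_C(0xFFF);
    s.offset = addr & UINT32_C(0xFFF);
    assert(s.base + s.offset == addr);
    return s;
}
```

For example, `split_address(111111)` yields a base of 110592 and an offset of 519, the same pair that appears in the generated `mov`/`str` sequence.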
In this specific case the compiler managed to split the constant into only two parts. If your immediate has more significant bits, it may have to generate even more instructions, or use a literal pool load. For example, if I use 0xABCDEF78, I get this (for ARMv7):
movw r3, #61439
movt r3, 43981
mov r2, #4096
str r2, [r3, #-135]
mov r0, #0
bx lr
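To see that this sequence really targets 0xABCDEF78, the arithmetic can be replayed in C. This is a sketch under the usual reading of MOVW (set the low 16 bits, clear the rest) followed by MOVT (set the high 16 bits); the function names `movw_movt` and `effective_address` are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Register value after MOVW lo16 followed by MOVT hi16. */
uint32_t movw_movt(uint32_t lo16, uint32_t hi16)
{
    assert(lo16 <= 0xFFFFu && hi16 <= 0xFFFFu);
    return (hi16 << 16) | lo16;
}

/* Address accessed by STR with a base register value and a signed
   immediate offset in the ARM-mode range. */
uint32_t effective_address(uint32_t base, int32_t offset)
{
    assert(offset >= -4095 && offset <= 4095);
    return base + (uint32_t)offset; /* unsigned wrap-around is well defined */
}
```

Here `movw_movt(61439, 43981)` gives 0xABCDEFFF, and `effective_address(0xABCDEFFF, -135)` gives 0xABCDEF78, so the compiler loaded a nearby base and corrected it with the store offset.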
For architectures without MOVW/MOVT (e.g. ARMv4), GCC seems to fall back to a literal pool:
mov r2, #4096
ldr r3, .L2
str r2, [r3, #-135]
mov r0, #0
bx lr
.L3:
.align 2
.L2:
.word -1412567041
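The literal pool word is printed as a signed decimal, which hides the address it stands for. A small sketch in C, with the hypothetical helper name `literal_pool_target`, reinterprets the word as an unsigned 32-bit value and applies the store offset:

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a signed .word value as a 32-bit address and add the
   signed STR offset to it. */
uint32_t literal_pool_target(int32_t word, int32_t offset)
{
    assert(offset >= -4095 && offset <= 4095);
    return (uint32_t)word + (uint32_t)offset;
}
```

`literal_pool_target(-1412567041, 0)` is 0xABCDEFFF, and with the offset of -135 from the `str` instruction the result is 0xABCDEF78.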
The compiler is probably taking advantage of the ARM immediate value encoding to reduce code size. Basically, 110592 is 0x1B << 12, which fits that encoding, so it can be loaded with a single MOV. Take a look at the output of arm-none-eabi-objdump -d for your program to check the length of each instruction.