decode ARM BL instruction

Posted 2019-08-21 18:15

Question:

I'm just getting started with the ARM architecture on my Nucleo STM32F303RE, and I'm trying to understand how the instructions are encoded.

I have a simple LED-blinking program running, and the first few disassembled application instructions are:

08000188:   push    {lr}
0800018a:   sub     sp, #12
235         __initialize_hardware_early ();
0800018c:   bl      0x80005b8 <__initialize_hardware_early>

These instructions resolve to the following in the hex file (displayed weird in Eclipse -- each 32-bit word is in MSB order, but Eclipse doesn't seem to know it... but that's for another topic):

address 0x08000188:  B083B500 FA14F000

Using the ARM Architecture Ref Manual, I've confirmed the first 2 instructions, push (0xB500) and sub (0xB083). But I can't make any sense out of the "bl" instruction.

The hex instruction is 0xFA14F000. The Ref Manual says it breaks down like this:

31.28   27 26 25 24   23............0
cond     1  0  1  L   signed_immed_24

The first "F" (0xF......) makes sense: all conditions are set (ALways).

The "A" doesn't make sense though, since the L bit should be set (1011). Shouldn't it be 0xFB......?

And the signed_immed_24 doesn't make sense, either. The ref manual says:

- start with 0x14F000
- sign extend to 30 bits (signed 2's-complement), giving 0x0014F000
- shift left to form 32-bit value, giving 0x0053C000
- add to the PC, which is the current instruction + 8, giving 0x0800018c + 8 + 0x0053C000, or 0x0853C194.

So I get a branch address of 0x0853C194, but the disassembly shows 0x080005B8.

What am I missing?

Thanks! -Eric

Answer 1:

bl is two separate 16-bit instructions. The ARMv5 (and older) ARM ARM does a better job of documenting them.

15 14 13   12 11   10.........0
 1  1  1     H      offset_11

From the ARM ARM

The first Thumb instruction has H == 10 and supplies the high part of the branch offset. This instruction sets up for the subroutine call and is shared between the BL and BLX forms.

The second Thumb instruction has H == 11 (for BL) or H == 01 (for BLX). It supplies the low part of the branch offset and causes the subroutine call to take place.

0xFA14 0xF000

0xF000 is the first instruction (H == 10); its upper offset is all zeros. 0xFA14 is the second instruction (H == 11); its offset is 0x214.
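As a rough illustration of where those two halfwords and their fields come from, here is a small Python sketch. It assumes the dump in the question shows little-endian 32-bit words, so the low half of each word sits at the lower address; the function names are made up for this example.

```python
def split_word(word):
    """Split a 32-bit little-endian word into (halfword at lower address,
    halfword at higher address)."""
    return word & 0xFFFF, (word >> 16) & 0xFFFF


def bl_fields(halfword):
    """Return (H, offset_11) for one half of a classic Thumb BL/BLX pair."""
    if (halfword >> 13) != 0b111:
        raise ValueError("bits 15:13 are not 111")
    h = (halfword >> 11) & 0b11   # bits 12:11
    offset_11 = halfword & 0x7FF  # bits 10:0
    return h, offset_11


first, second = split_word(0xFA14F000)
print(hex(first), bl_fields(first))    # 0xf000 (2, 0)    -> H == 10
print(hex(second), bl_fields(second))  # 0xfa14 (3, 532)  -> H == 11, 0x214
```

The same split applied to the first word, 0xB083B500, gives 0xB500 (push) followed by 0xB083 (sub), which matches the disassembly order.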

If starting at 0x0800018c then it is 0x0800018C + 4 + (0x0000214<<1) = 0x080005B8. The 4 is there because the PC reads two instructions ahead of the current one. And the offset is in units of (16-bit) instructions.
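The arithmetic can be checked with a short Python sketch. It follows the two-halfword description quoted from the ARM ARM (high part sign-extended and shifted left by 12, low part shifted left by 1); `thumb_bl_target` and `sign_extend` are hypothetical helper names, not part of any tool.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def thumb_bl_target(addr, first, second):
    """Target of a classic Thumb BL whose first halfword is at addr."""
    if (first >> 11) != 0b11110:
        raise ValueError("first halfword is not the H == 10 half")
    if (second >> 11) != 0b11111:
        raise ValueError("second halfword is not the H == 11 (BL) half")
    pc = addr + 4                               # PC reads 4 bytes ahead in Thumb
    high = sign_extend(first & 0x7FF, 11) << 12  # upper part of the offset
    low = (second & 0x7FF) << 1                  # lower part, in halfword units
    return (pc + high + low) & 0xFFFFFFFF


print(hex(thumb_bl_target(0x0800018C, 0xF000, 0xFA14)))  # 0x80005b8
```

A backward branch works the same way: the pair 0xF7FF, 0xFFFE encodes an offset of -4, so it branches to its own address.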

I guess the ARMv7-M ARM ARM covers it as well, but it is harder to read, and apparently features were added. They do not affect you for this branch with link, though.
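For comparison, here is a sketch of the decode as the ARMv7-M ARM appears to describe it for BL (encoding T1): the pair is treated as one 32-bit instruction, and bits 13 and 11 of the second halfword (J1 and J2) combine with the sign bit S to widen the offset. The function name is hypothetical and the field layout is written from the manual's description, so check it against your copy.

```python
def thumb2_bl_target(addr, hw1, hw2):
    """Target of a Thumb-2 BL (encoding T1) whose first halfword is at addr."""
    # hw1: 11110 S imm10      hw2: 1 1 J1 1 J2 imm11
    if (hw1 >> 11) != 0b11110 or (hw2 & 0xD000) != 0xD000:
        raise ValueError("not a BL (encoding T1)")
    s = (hw1 >> 10) & 1
    imm10 = hw1 & 0x3FF
    j1 = (hw2 >> 13) & 1
    j2 = (hw2 >> 11) & 1
    imm11 = hw2 & 0x7FF
    i1 = 1 - (j1 ^ s)  # I1 = NOT(J1 EOR S)
    i2 = 1 - (j2 ^ s)  # I2 = NOT(J2 EOR S)
    # imm32 = SignExtend(S:I1:I2:imm10:imm11:'0')
    imm = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    if s:
        imm -= 1 << 25
    return (addr + 4 + imm) & 0xFFFFFFFF


print(hex(thumb2_bl_target(0x0800018C, 0xF000, 0xFA14)))  # 0x80005b8
```

With J1 == J2 == 1, as in 0xFA14, the result is the same as under the older two-instruction reading, which is why the added bits make no difference for this particular branch.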

The ARMv5 ARM ARM does a better job of describing what happens as well. You can certainly take these two separate instructions and move them apart

.byte 0x00,0xF0    @ first half, 0xF000 (bytes in little-endian order)
nop
nop
nop
nop
nop
.byte 0x14,0xFA    @ second half, 0xFA14

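and it will still branch to the same target. Going by the ARMv5 description, the first half adds its part of the offset to its own PC and leaves the result in LR, and the second half adds the low part to LR, so the target is relative to the first instruction; only the return address depends on where the second one sits. Maybe they broke that in some cores, but I know in some (after ARMv5) it works.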
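A toy Python model of the two halves, following the ARMv5 ARM pseudo-code as I understand it (first half: LR = PC + (SignExtend(offset_11) << 12); second half: PC = LR + (offset_11 << 1), then LR = address of the next instruction with bit 0 set). This only models the documented behaviour and is not a test on real hardware; `run_split_bl` is a made-up name, and each nop is assumed to be 2 bytes.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def run_split_bl(first_addr, first, second_addr, second):
    """Model the two BL halves executing at separate addresses.

    Returns (branch target, return address left in LR).
    """
    # First half (H == 10): LR = PC + (SignExtend(offset_11) << 12)
    lr = (first_addr + 4) + (sign_extend(first & 0x7FF, 11) << 12)
    # Second half (H == 11): PC = LR + (offset_11 << 1)
    target = (lr + ((second & 0x7FF) << 1)) & 0xFFFFFFFF
    # LR = address of the instruction after the second half, Thumb bit set
    return_addr = (second_addr + 2) | 1
    return target, return_addr


base = 0x0800018C
adjacent = run_split_bl(base, 0xF000, base + 2, 0xFA14)
apart = run_split_bl(base, 0xF000, base + 2 + 5 * 2, 0xFA14)  # five 2-byte nops
print([hex(v) for v in adjacent])  # ['0x80005b8', '0x8000191']
print([hex(v) for v in apart])     # ['0x80005b8', '0x800019b']
```

In this model the target is identical in both cases, and only the return address moves with the second half.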