Enable NEON on ARM Cortex-A series

Posted 2019-07-31 04:10

Question:

I want to initialize the NEON coprocessor on a bare-metal Cortex-A15. Following ARM's instructions, I wrote this sequence at the end of my platform init code:

MOV r0, #0x00F00000
MRC p15, 0, r0, c1, c1, 2
ORR r0, r0, #0x0C00 
BIC r0, r0, #0xC000 
MCR p15, 0, r0, c1, c1, 2
ISB
MRC p15, 4, r0, c1, c1, 2
BIC r0, r0,  #0x0C00
BIC r0, r0, #(3<<14)
MCR p15, 4, r0, c1, c1, 2
ISB
MOV r3, #0x40000000
VMSR FPEXC, r3

I get this error:

Error: operand 0 must be FPSCR -- `vmsr FPEXC,r3'

This is the output of arm-eabi-as --version:

GNU assembler (GNU Binutils) 2.21
Copyright 2010 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `arm-eabi'.

If I change FPEXC to FPSCR, the program assembles, but at run time it raises an unhandled exception:

MRC p15, 4, r0, c1, c1, 2

Answer 1:

A sequence for initializing the VFPU can be found in the U-Boot source:

.macro init_vfpu
  ldr r0, =(0xF << 20)
  mcr p15, 0, r0, c1, c0, 2
  mov r3, #0x40000000
  .long 0xeee83a10
  /* vmsr FPEXC, r3 */
.endm /* init_vfpu */

As documented on the binutils mailing list, the vmsr FPEXC bug has been fixed on the binutils 2.23 branch, on HEAD, and on the 2.24 development branch, which is due to be released shortly. The 2.23.1 and 2.23.2 releases of binutils already contain the fix.

Here is a sample session:

$ cat t.S
init_vpu:
  ldr r0, =(0xF << 20)
  mcr p15, 0, r0, c1, c0, 2
  mov r3, #0x40000000
  vmsr FPEXC, r3
  bx  lr
  .ltorg
$ arm-none-linux-gnueabi-as -march=armv7-a -mcpu=cortex-a15 -mfpu=neon t.S -o t.o
$ arm-none-linux-gnueabi-as --version | grep assembler
GNU assembler (crosstool-NG hg+default-86a8d1d467c8) 2.23.1
This assembler was configured for a target of `arm-none-linux-gnueabi'.
$ objdump --version | grep Binutils
GNU objdump (GNU Binutils for Ubuntu) 2.23.2
$ objdump -S t.o 

t.o:     file format elf32-littlearm

Disassembly of section .text:

00000000 <init_vpu>:
   0:   e3a0060f        mov     r0, #15728640   ; 0xf00000
   4:   ee010f50        mcr     15, 0, r0, cr1, cr0, {2}
   8:   e3a03101        mov     r3, #1073741824 ; 0x40000000
   c:   eee83a10        vmsr    fpexc, r3
  10:   e12fff1e        bx      lr

The above sequence should work on all Cortex-A series processors. It assumes a system where neither virtualization nor TrustZone is active.
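
For readers wondering where the two constants in the sequence come from, here is a small host-side Python sketch (not part of the original answer) that rebuilds them from bit fields. It assumes the ARMv7-A layout in which the CPACR access field for coprocessor N occupies bits [2N+1:2N], with 0b11 meaning full access, and in which FPEXC.EN is bit 30. The names cpacr_full_access and FPEXC_EN are made up for illustration.

```python
def cpacr_full_access(*coprocs: int) -> int:
    """Build a CPACR value granting full access (0b11) to each listed coprocessor.

    Assumes the ARMv7-A layout: the field for cpN sits at bits [2N+1:2N].
    """
    value = 0
    for n in coprocs:
        if not 0 <= n <= 13:
            raise ValueError("CPACR only has fields for cp0..cp13")
        value |= 0b11 << (2 * n)
    return value


# FPEXC.EN (bit 30) enables the VFP/NEON unit once coprocessor access is granted.
FPEXC_EN = 1 << 30

if __name__ == "__main__":
    # cp10 and cp11 together control VFP and NEON access.
    print("CPACR = 0x%08x" % cpacr_full_access(10, 11))
    print("FPEXC = 0x%08x" % FPEXC_EN)
```

The results match the literals in the assembly: (0xF << 20) is loaded into r0 for the CPACR write, and 0x40000000 is loaded into r3 for the FPEXC write.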



Answer 2:

You need to update binutils from git. This issue was fixed earlier this year in this commit, but apparently there has been no release since then. Alternatively, encode the instruction manually.
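
As a rough illustration of encoding the instruction manually, the following host-side Python sketch (an addition, not from the original answers) computes the 32-bit ARM-state word for vmsr. It assumes the A1 encoding pattern cond 1110 1110 reg Rt 1010 0001 0000 and the register selectors FPSID = 0b0000, FPSCR = 0b0001, FPEXC = 0b1000. The helper name encode_vmsr is hypothetical.

```python
# Fixed opcode bits of VMSR (A1 encoding), with the condition field left empty.
VMSR_BASE = 0x0EE00A10

# Values of the 4-bit "reg" field selecting the VFP system register.
SPEC_REG = {"FPSID": 0b0000, "FPSCR": 0b0001, "FPEXC": 0b1000}


def encode_vmsr(spec_reg: str, rt: int, cond: int = 0xE) -> int:
    """Return the 32-bit ARM-state encoding of `vmsr <spec_reg>, r<rt>`.

    cond defaults to 0xE (always execute). r15 is rejected as a source register.
    """
    if not 0 <= rt <= 14:
        raise ValueError("rt must be in r0..r14")
    return (cond << 28) | VMSR_BASE | (SPEC_REG[spec_reg] << 16) | (rt << 12)


if __name__ == "__main__":
    word = encode_vmsr("FPEXC", 3)
    # Emit a line that an assembler without FPEXC support can still accept.
    print(".long 0x%08x  /* vmsr FPEXC, r3 */" % word)
```

For FPEXC and r3 this yields 0xeee83a10, the same word the U-Boot macro emits with .long and that the fixed assembler produces in the objdump listing. Keeping the mnemonic in a comment next to the raw word, as U-Boot does, makes it easy to switch back once the toolchain is updated.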