creating constant jump table; xcode; clang; asm

Posted 2019-05-10 00:18

Question:

I have a rather strange issue when trying to create a jump table in my asm program for iPhone (arm64):

.globl my_func
my_func:
...
//jump (switch) table
.L.f_switch:
    .short .L.case0 - .L.f_switch
    .short .L.case1 - .L.f_switch
    ...
.L.case0:
//some case code
...
.L.case1:
//other case code 

After assembly, this table is filled with zeros instead of the actual values. This can be seen by dumping the compiled object file:

(__TEXT,__text) section
_my_func:
0000000000000000    adr x4, #16
0000000000000004    ldrh    w5, [x4, x3, lsl #1]
0000000000000008    add x4, x4, w5, uxth
000000000000000c    br  x4
.L.f_switch:
0000000000000010    .long   0x00000000
0000000000000014    .long   0x00000000
0000000000000018    .long   0x00000000
000000000000001c    nop

How can I resolve this?

Answer 1:

I believe the zeroed entries you are observing are related to relocation. The assembler may emit relocation information that the linker will ultimately resolve. To demonstrate, I created this small sample program:

test.s

.text
.align 4
.globl _main
_main:
    adr  x0, .L.f_switch
    ldr  w1, [x0, x1, LSL#2]
    add  x0, x0, x1
    br   x0

.L.f_switch:
    .word  .L.case0 - .L.f_switch
    .word  .L.case1 - .L.f_switch
    .word  .L.case2 - .L.f_switch

.L.case0:
    nop

.L.case1:
    nop

.L.case2:
    nop

    ret

I'm using Xcode 7, and clang --version reports this version info:

Apple LLVM version 7.0.0 (clang-700.0.72)
Target: x86_64-apple-darwin14.5.0
Thread model: posix

To simplify things at the command line I set an environment variable to point to my iPhone SDK with:

export ISYSROOT="/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/"

The first experiment is to assemble test.s into test.o. I use this command:

clang -x assembler  -arch arm64 test.s -o test.o -c

Now if I dump test.o with otool using:

otool -drGtv test.o

I get this:

test.o:
Data in code table (0 entries)
offset     length kind
Relocation information (__TEXT,__text) 6 entries
address  pcrel length extern type    scattered symbolnum/value
00000018 False long   True   SUB     False     .L.f_switch
00000018 False long   True   UNSIGND False     .L.case2
00000014 False long   True   SUB     False     .L.f_switch
00000014 False long   True   UNSIGND False     .L.case1
00000010 False long   True   SUB     False     .L.f_switch
00000010 False long   True   UNSIGND False     .L.case0
(__TEXT,__text) section
_main:
0000000000000000        adr     x0, #16
0000000000000004        ldr     w1, [x0, x1, lsl #2]
0000000000000008        add      x0, x0, x1
000000000000000c        br      x0
.L.f_switch:
0000000000000010        .long   0x00000000
0000000000000014        .long   0x00000000
0000000000000018        .long   0x00000000
.L.case0:
000000000000001c        nop
.L.case1:
0000000000000020        nop
.L.case2:
0000000000000024        nop
0000000000000028        ret

The assembler has emitted relocation entries at 00000010, 00000014, and 00000018 for both operands of each expression (.L.case# and .L.f_switch). The table itself is filled with placeholder zeros. It is the linker's job to resolve the relocations. I can manually link the test.o above with a command like:

ld  -demangle -dynamic -arch arm64 -iphoneos_version_min 5.0.0 -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/ -o test -L/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk//usr/lib/system test.o -lSystem /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.0/lib/darwin/libclang_rt.ios.a

I can now use otool to dump the final executable with a command like:

otool -drGtv test

And get this output:

test:
Data in code table (0 entries)
offset     length kind
(__TEXT,__text) section
_main:
0000000100007f80        adr     x0, #16
0000000100007f84        ldr     w1, [x0, x1, lsl #2]
0000000100007f88        add      x0, x0, x1
0000000100007f8c        br      x0
.L.f_switch:
0000000100007f90        .long   0x0000000c
0000000100007f94        .long   0x00000010
0000000100007f98        .long   0x00000014
.L.case0:
0000000100007f9c        nop
.L.case1:
0000000100007fa0        nop
.L.case2:
0000000100007fa4        nop
0000000100007fa8        ret

Notice that all the relocations have been resolved by the linker in the final executable.
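
The resolved entries can be cross-checked by hand: each one should equal the address of its case label minus the address of .L.f_switch. Here is a minimal Python sketch of that arithmetic, with the addresses copied from the otool dump of the linked executable (the variable names are only illustrative):

```python
# Addresses copied from the otool dump of the linked executable.
table_base = 0x100007F90                               # .L.f_switch
case_addrs = [0x100007F9C, 0x100007FA0, 0x100007FA4]   # .L.case0 .. .L.case2

# Each table entry is the distance from the table to its case label.
offsets = [addr - table_base for addr in case_addrs]
print([hex(o) for o in offsets])  # ['0xc', '0x10', '0x14']
```

These match the three .long values that the linker wrote into the table.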

Alternatively I could have compiled and linked all in one step to produce the executable test with a command like:

clang -x assembler  -arch arm64 -L$ISYSROOT/usr/lib/system --sysroot=$ISYSROOT test.s -o test

I split it up to show what the object file looked like and then the resulting executable after linking.



Answer 2:

First of all, I want to thank Michael Petch for his contribution to this discussion, which was very helpful.

Secondly, I want to highlight that the size of the entries in the jump table matters. Clang has no issues with '.word' (4-byte) offsets. The trouble begins when '.byte' (1-byte) or '.short'/'.hword' (2-byte) offsets are used.



Test 1. Data type is '.short' (2 bytes).

my_func:
...
//jump (switch) table
.L.f_switch:
    .short .L.case0 - .L.f_switch
    .short .L.case1 - .L.f_switch
    ...
.L.case0:
//some case code
...
.L.case1:
//other case code 

The dump is:

Relocation information (__TEXT,__text) 10 entries
address  pcrel length extern type    scattered symbolnum/value
00000018 False word   True   SUB     False     .L.f_switch
00000018 False word   True   UNSIGND False     .L.case4
00000016 False word   True   SUB     False     .L.f_switch
00000016 False word   True   UNSIGND False     .L.case3
00000014 False word   True   SUB     False     .L.f_switch
00000014 False word   True   UNSIGND False     .L.case2
00000012 False word   True   SUB     False     .L.f_switch
00000012 False word   True   UNSIGND False     .L.case1
00000010 False word   True   SUB     False     .L.f_switch
00000010 False word   True   UNSIGND False     .L.case0

(__TEXT,__text) section
_my_func:
0000000000000000 adr x4, #16
0000000000000004 ldrh w5, [x4, x3, lsl #1]
0000000000000008 add x4, x4, w5, uxth
000000000000000c br x4
.L.f_switch:
0000000000000010 .long 0x00000000
0000000000000014 .long 0x00000000
0000000000000018 .long 0x00000000
000000000000001c nop

Up to this point everything goes as Michael described in his answer (except that the space reserved for each entry is 2 bytes).

After that, the linker returns an error:

in section __TEXT,__text reloc 0: ARM64_RELOC_SUBTRACTOR must have r_length of 2 or 3 for architecture arm64

Note that there are no errors if 4-byte entries are used.
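
The r_length in that message is the Mach-O relocation field that stores log2 of the relocated field's size in bytes (0 = 1 byte, 1 = 2 bytes, 2 = 4 bytes, 3 = 8 bytes); otool prints these as byte, word, long, and quad, which is why the dump above shows "word" for the .short entries. A rough Python sketch of the rule the linker is enforcing, where the helper names and the directive table are made up for illustration:

```python
# Hypothetical lookup for illustration: entry size in bytes per directive on arm64.
DIRECTIVE_SIZES = {".byte": 1, ".short": 2, ".hword": 2, ".word": 4, ".quad": 8}

def r_length(size_in_bytes):
    # Mach-O r_length is log2 of the field size (sizes are powers of two).
    return size_in_bytes.bit_length() - 1

def subtractor_allowed(directive):
    # Per the linker message, ARM64_RELOC_SUBTRACTOR needs r_length 2 or 3.
    return r_length(DIRECTIVE_SIZES[directive]) in (2, 3)

for name, size in DIRECTIVE_SIZES.items():
    print(name, "r_length =", r_length(size), "allowed =", subtractor_allowed(name))
```

Under this reading, '.word' and '.quad' entries pass, while '.byte', '.short', and '.hword' entries are rejected whenever the label difference is left for the linker to resolve.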



Test 2. This can be treated as a workaround.

    .set case_0,     .L.case0 - .L.f_switch
    .set case_1,     .L.case1 - .L.f_switch
    .set case_2,     .L.case2 - .L.f_switch
    ...

.L.f_switch:
    .hword  case_0
    .hword  case_1
    .hword  case_2
    ...

The dump for this approach is:

(__TEXT,__text) section
_my_func:
0000000000000000 adr x4, #16
0000000000000004 ldrh w5, [x4, x3, lsl #1]
0000000000000008 add x4, x4, w5, uxth
000000000000000c br x4
.L.f_switch:
0000000000000010 .long 0x01200020
0000000000000014 .long 0x06900240
0000000000000018 .long 0x00000cc0
000000000000001c nop

As you can see, the assembler fills the jump table directly with the correct offset values. As a result, there is no relocation information and no issue with the linker.
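
Because otool groups the table into .long words, the individual 2-byte entries are not obvious at first glance. The following Python sketch splits them back out; it assumes otool prints each .long as a little-endian 32-bit word, and the variable names are only illustrative:

```python
import struct

# .long values copied from the "Test 2" dump, in address order.
longs = [0x01200020, 0x06900240, 0x00000CC0]

# Assumption: each .long is a little-endian 32-bit word, so re-packing
# with "<I" recovers the bytes as they sit in memory.
raw = b"".join(struct.pack("<I", value) for value in longs)

# Reinterpret the same bytes as little-endian 16-bit table entries.
entries = struct.unpack("<%dH" % (len(raw) // 2), raw)
print([hex(e) for e in entries])
# ['0x20', '0x120', '0x240', '0x690', '0xcc0', '0x0']
```

The first five values would be the offsets of the case labels from .L.f_switch, assuming the table has the same five entries as in Test 1; the trailing zero is presumably alignment padding before the next instruction.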


I also want to draw attention to the following facts.

  • GCC produces the result shown in "Test 2" (a filled jump table) for both the "Test 1" and "Test 2" code.
  • GCC generates an error if an offset in the table does not fit in the chosen data type, for example when a 1-byte data type is used and the offset is bigger than 255. Clang does not generate any error in such cases, so the programmer has to check this manually.
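
Since Clang stays silent about overflow, the check has to be done by hand or by a small script. Below is one possible Python sketch; check_offsets is a hypothetical helper (not part of any toolchain), and it assumes the entries are loaded as unsigned values, as with the ldrh / uxth sequence in the question:

```python
def check_offsets(offsets, entry_size):
    """Raise ValueError if an offset does not fit an unsigned entry of entry_size bytes."""
    limit = (1 << (8 * entry_size)) - 1  # 255 for 1 byte, 65535 for 2 bytes
    for index, offset in enumerate(offsets):
        if not 0 <= offset <= limit:
            raise ValueError(
                "entry %d: offset %#x does not fit in %d byte(s)"
                % (index, offset, entry_size)
            )
    return True

# Offsets decoded from the "Test 2" dump fit in 2-byte entries...
offsets = [0x20, 0x120, 0x240, 0x690, 0xCC0]
print(check_offsets(offsets, 2))

# ...but not in 1-byte entries, which is the case Clang would not report.
try:
    check_offsets(offsets, 1)
except ValueError as err:
    print("overflow:", err)
```

The offsets would have to be taken from a disassembly or a linker map, so this is a manual safety net rather than a replacement for an assembler diagnostic.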