When are the GAS ELF directives .type, .thumb, .si

Posted 2019-02-01 23:22

I'm working on an assembly program for an ARM Cortex-M3 based microcontroller (Thumb 2 instruction set), using GNU as.

In some example code I find directives like .size, .section and .type which I understand are ELF directives. As an example:

    .section    .text.Reset_Handler
    .weak       Reset_Handler
    .type       Reset_Handler, %function  
Reset_Handler:
    bl      main
    b       Infinite_Loop    
    .size   Reset_Handler, .-Reset_Handler



The .type directive is said to set the type of a symbol - usually either to %object (meaning data?) or %function. I do not know what difference it makes. It is not always included, so I am unsure when it needs to be used.

Also related to this is the .thumb_func directive. From what I have read, it seems like it might be equivalent to:

.thumb 
.type Symbol_Name, %function

Or is it something completely different?



.size supposedly sets the size associated with a symbol. When this is needed, I have no idea. Is this calculated by default, but overrideable with this directive? If so - when would you want to override?



.section is easier to find docs on, and I think I have a fair idea of what it does, but I am still a little bit unsure about the usage. The way I understand it, it switches between different ELF sections (text for code, data for writable data, bss for uninitialized data, rodata for constants, and others), and defines new ones when desired. I guess you would switch between these depending on whether you define code, data, uninitialized data, etc. But why would you create a subsection for a function, as in the example above?


Any help with this is appreciated. If you can find links to tutorials or docs that explain this in greater detail - preferably understandable for a novice - I would be very grateful.

So far, the Using as manual has been of some help - maybe you can get more out of it than me, with more knowledge.

3 answers
骚年 ilove
#2 · 2019-02-02 00:03

I came across this when trying to figure out why ARM and Thumb interworking broke with recent binutils (verified with 2.21.53 (MacPorts), also 2.22 (Yagarto 4.7.1)).

From my experience, .thumb_func worked fine with earlier binutils to generate the correct interworking veneers. However, with the more recent releases, the .type *name*, %function directive is needed to ensure proper veneer generation.

binutils mailing list post

I'm too lazy to dig up an older version of binutils to check if the .type directive is sufficient in place of .thumb_func for earlier binutils. I guess there is no harm in including both directives in your code.

Edited: updated the comment on using .thumb_func in the code. Apparently it works for ARM->Thumb interworking, flagging the Thumb routine so that veneers are generated, but Thumb->ARM interworking fails unless the .type directive is used to flag the ARM function.

Lonely孤独者°
#3 · 2019-02-02 00:19

Sections of your program are tightly related to the ELF format, in which most systems (Linux, BSD, ...) store their object and executable files. This article should give you good insight into how ELF works, which will help you understand the why of sections.

Simply put, sections let you organize your program into different memory areas which have different properties, including address, permission to execute and write, etc. During the final link stage, the linker uses a particular linker script that usually groups all sections of the same name together (e.g. all code from all compilation units together, ...) and assigns them a final address in memory.

For embedded systems their use is particularly obvious. First, the boot code (usually contained in the .text section) must be loaded at a fixed address in order to be executed. Then, read-only data can be grouped into a dedicated read-only section that will be mapped into the ROM area of the device. A last example: operating systems have initialization functions that are called only once and never used afterwards, wasting precious memory space. If all these initialization functions are grouped together into a dedicated section called, say, .initcode, and if this section is set to be the last section of the program, then the operating system can easily reclaim this memory once initialization is finished by lowering the upper limit of its own memory. Linux, for instance, is known to use that trick, and GCC allows you to place a variable or function into a specific section by postfixing its declaration with __attribute__ ((section ("MYSECTION"))).
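As a rough illustration of the section attribute mentioned above, here is a small C sketch that puts function pointers into a custom section and then walks that section at run time. The section name initcode, the two hook functions and run_init_hooks are hypothetical names made up for this example. It assumes an ELF target linked with GNU ld or a compatible linker, which defines __start_NAME and __stop_NAME symbols for a section whose name is a valid C identifier; with a hand-written linker script you would typically define such bounds yourself.

```c
typedef void (*init_hook_t)(void);

static int init_calls;                            /* how many hooks have run */

static void init_uart(void)  { init_calls++; }    /* hypothetical hook */
static void init_timer(void) { init_calls++; }    /* hypothetical hook */

/* Place one pointer per hook into the custom section "initcode".
   "used" keeps the compiler from discarding the unreferenced variables. */
static init_hook_t hook_uart
    __attribute__((section("initcode"), used)) = init_uart;
static init_hook_t hook_timer
    __attribute__((section("initcode"), used)) = init_timer;

/* Start and end of the section, defined by the linker (assumption:
   GNU ld style __start_/__stop_ symbols). */
extern init_hook_t __start_initcode[];
extern init_hook_t __stop_initcode[];

/* Call every hook stored in the section; return how many were called. */
int run_init_hooks(void)
{
    int count = 0;
    for (init_hook_t *hook = __start_initcode; hook < __stop_initcode; ++hook) {
        (*hook)();
        ++count;
    }
    return count;
}
```

Calling run_init_hooks() once from main would run both hooks. The point is only that the linker, not the source file order, decides where the section's contents end up.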

.type and .size are actually still quite unclear to me too. I see them as helpers for the linker and never saw them outside of assembler-generated code.

.thumb_func seems to be needed only for the old OABI interface, in order to allow interworking with ARM code. Unless you are using an old toolchain, you probably don't have to worry about it.

Lonely孤独者°
#4 · 2019-02-02 00:22

I have been programming ARM/Thumb for many years, with lots of assembler, and have needed very few of the many directives out there.

.thumb_func is quite important, as pointed out by another responder.

For example:

.globl _start
_start:
    b   reset

reset:

.arm

.globl one
one:
    add r0,r0,#1
    bx lr

.thumb

.globl two
two:
    add r0,r0,#2
    bx lr

.thumb_func
.globl three
three:
    add r0,r0,#3
    bx lr


.word two
.word three

.arm (it used to be something like .code32 or .code 32) tells the assembler that this is ARM code, not Thumb code. For your Cortex-M3 you won't need to use it.

.thumb is the same deal (it used to be .code 16, and maybe that still works): it makes the following code Thumb, not ARM.

If the labels you are using are not global labels that you need to branch to from other files or indirectly, then you won't need .thumb_func. But for the address of one of these global labels to be computed properly (the least significant bit is 1 for Thumb and 0 for ARM), you want to mark it as a Thumb or ARM label, and .thumb_func does that. Otherwise you have to set that bit yourself before branching, which adds more code, and the label is not callable from C.


00000000 <_start>:
   0:   eaffffff    b   4 <one>

00000004 <one>:
   4:   e2800001    add r0, r0, #1
   8:   e12fff1e    bx  lr

0000000c <two>:
   c:   3002        adds    r0, #2
   e:   4770        bx  lr

00000010 <three>:
  10:   3003        adds    r0, #3
  12:   4770        bx  lr
  14:   0000000c    andeq   r0, r0, ip
  18:   00000011    andeq   r0, r0, r1, lsl r0

Up to the .thumb directive, the assembled code is ARM code, as desired.

Both the two and three labels/functions are Thumb code, as desired, but the .word for two holds an even address (0xc), while the .word for three holds the proper odd address (0x11).

The latest codesourcery tools were used to assemble, link, and dump the above sample.
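To make the role of that low bit concrete, here is a minimal C sketch that interprets the two .word values from the dump above the way an interworking branch such as bx would. The function names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of an interworking branch target selects the instruction set:
   1 means Thumb, 0 means ARM. */
bool targets_thumb(uint32_t branch_target)
{
    return (branch_target & 1u) != 0;
}

/* The instruction itself lives at the target with bit 0 cleared. */
uint32_t instruction_address(uint32_t branch_target)
{
    return branch_target & ~UINT32_C(1);
}
```

With the values from the dump, targets_thumb(0x0000000c) is false even though two is Thumb code, so a bx to that value would try to run it as ARM code. targets_thumb(0x00000011) is true, and instruction_address(0x00000011) gives 0x10, the address of three.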

Now, for the Cortex-M3, where everything is Thumb (Thumb/Thumb-2), .thumb_func may not be as important; it may just work with command line switches (it is very easy to do an experiment to find out). It is a good habit to have, though, in case you move away from a Thumb-only processor to a normal ARM/Thumb core.

Assemblers generally like to add all of these directives and other ways of making things look and feel more like a high level language. I am just saying you don't have to use them. I have switched assemblers for ARM and use many different assemblers for many different processors, and I prefer the less-is-more approach: focus on the assembly itself and use as few tool-specific items as possible. I am usually the exception, not the rule, though, so you can probably figure out the more commonly used directives by looking at what the compiler generates (and verifying with the documentation).

unsigned int one ( unsigned int x )
{
    return(x+1);
}


    .arch armv5te
    .fpu softvfp
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 2
    .eabi_attribute 18, 4
    .file   "bob.c"
    .text
    .align  2
    .global one
    .type   one, %function
one:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    add r0, r0, #1
    bx  lr
    .fnend
    .size   one, .-one
    .ident  "GCC: (Sourcery G++ Lite 2010.09-50) 4.5.1"
    .section    .note.GNU-stack,"",%progbits

I do use .align when mixing ARM and Thumb assembler, or data in with assembler. You would expect the assembler for such a platform to know something as obvious as Thumb instructions being on halfword boundaries and ARM instructions being aligned on word boundaries, but the tools are not always that smart. Sprinkling .aligns about won't hurt.

.text is the default, so that is a bit redundant, but it won't hurt. .text and .data are standard (not specific to ARM). If you are building for a combination of ROM and RAM on your target you may care (it depends on what you do with your linker script); otherwise .text will work for everything.

.size apparently records the size of the function, from its start to that directive. The assembler cannot figure this out on its own, so if the size of this function is important for your code, linker script, debugger, loader, or whatever, then this needs to be right; otherwise you don't have to bother. A function is a high level concept anyway: assembler doesn't really have functions, much less a need to declare their size. And the C compiler certainly doesn't care. It is only looking for a label to branch to and, in the case of the ARM family, whether it is Thumb code or ARM code that is being branched to.
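For a sense of where .type and .size end up, here is a minimal C sketch of the ELF symbol-table entry involved. It assumes a host that provides a Linux-style <elf.h>; the helper names are hypothetical. As far as I know, a plain label gets type STT_NOTYPE and size 0, and the two directives fill in the type nibble of st_info and the st_size field.

```c
#include <elf.h>

/* Build the kind of entry one would expect for
       .global one
       .type   one, %function
       .size   one, .-one
   The address and size are supplied by the caller. */
Elf32_Sym make_function_symbol(Elf32_Addr address, Elf32_Word size)
{
    Elf32_Sym sym = {0};
    sym.st_value = address;                          /* where the label is  */
    sym.st_size  = size;                             /* set by .size        */
    sym.st_info  = ELF32_ST_INFO(STB_GLOBAL,         /* set by .global      */
                                 STT_FUNC);          /* set by .type        */
    return sym;
}

/* %function maps to STT_FUNC, %object maps to STT_OBJECT. */
int is_function(const Elf32_Sym *sym)
{
    return ELF32_ST_TYPE(sym->st_info) == STT_FUNC;
}

int is_object(const Elf32_Sym *sym)
{
    return ELF32_ST_TYPE(sym->st_info) == STT_OBJECT;
}
```

For the ARM function one from the earlier dump, make_function_symbol(0x4, 8) describes a global function of 8 bytes (two ARM instructions). Tools such as readelf -s or nm -S print these same fields, which is an easy way to see what a given directive changed.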

You may find the .pool directive (there is a newer equivalent) useful if you are lazy with your immediates (ldr rx,=0x12345678) on long stretches of code. Here again the tools are not always smart enough to place this data after an unconditional branch; you sometimes have to tell them. I say lazy half seriously: it is painful to do the label: .word thing all the time, and I believe both the ARM and GCC tools allow that shortcut, so I use it as much as anyone else.

Also note that LLVM outputs an additional .eabi_attribute or two that is supported by Code Sourcery's version of (or modifications to) binutils but not supported (perhaps yet) by the GNU-released binutils. Two solutions work: modify LLVM's asm print function to not write the eabi_attributes, or at least write them as comments (@), or get the binutils source and modifications from Code Sourcery and build binutils that way. Code Sourcery tends to lead GNU (Thumb-2 support, for example) or perhaps backports new features, so I assume these LLVM attributes will be present in mainline binutils before long. I have suffered no ill effects from trimming the eabi_attributes off the LLVM-compiled code.

Here is the llvm output for the same function above, apparently this is the llc that I modified to comment out the eabi_attributes.

    .syntax unified
@   .eabi_attribute 20, 1
@   .eabi_attribute 21, 1
@   .eabi_attribute 23, 3
@   .eabi_attribute 24, 1
@   .eabi_attribute 25, 1
@   .eabi_attribute 44, 1
    .file   "bob.bc"
    .text
    .globl  one
    .align  2
    .type   one,%function
one:                                    @ @one
@ BB#0:                                 @ %entry
    add r0, r0, #1
    bx  lr
.Ltmp0:
    .size   one, .Ltmp0-one

The ELF file format is well documented and very easy to parse if you want to really see what the ELF-specific directives (if any) are doing. Many of these directives are there to help the linker more than anything: .thumb_func, .text and .data, for example.
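As a small taste of how simple the parsing can be, here is a minimal C sketch that checks only the identification bytes at the start of a file. It assumes a host with a Linux-style <elf.h>; the function name is hypothetical, and a real parser would go on to read the section headers and symbol table.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if buf starts with the identification bytes of a 32-bit
   little-endian ELF file (the usual layout for ARM objects), else 0. */
int looks_like_elf32_le(const unsigned char *buf, size_t len)
{
    if (len < EI_NIDENT)
        return 0;                       /* too short to hold e_ident */
    if (memcmp(buf, ELFMAG, SELFMAG) != 0)
        return 0;                       /* missing "\177ELF" magic   */
    return buf[EI_CLASS] == ELFCLASS32 && buf[EI_DATA] == ELFDATA2LSB;
}
```

Reading the first 16 bytes of an object file into a buffer and passing them to this function is enough to tell whether the rest of the file can be read with the Elf32_* structures.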
