Assembly segment declaration syntax

Posted 2020-04-21 01:21

Question:

I just started with ASM and the 8086 architecture, and I'm having trouble following some of the examples that come with emu8086.

SSEG    SEGMENT STACK   'STACK'
DW      100h    DUP(?)
SSEG    ENDS

Okay, SSEG I guess is a label naming the stack segment, and the SEGMENT keyword indicates that a segment follows, but what does STACK 'STACK' stand for?

And below, I think it means "Allocate (I don't know where) 100h 16-bit words, without values".
Is this correct and if so, where is it allocated?

Answer 1:

I assume that emu8086 supports, for declaring a segment, the same syntax as TASM does, which in turn supports the same syntax as MASM.


A segment is declared with <name> SEGMENT [attributes] or, in TASM's Ideal mode, SEGMENT <name> [attributes]. The attributes are optional, and a default value is inferred for any that is missing.

<name> can be any valid name not already defined (beware that the .MODEL directive defines some names, including _TEXT and _DATA).

The [attributes] are divided into five categories, and each category has one or more values to choose from. Values from different categories are separated by whitespace, and no more than a single value from any category can appear.

Segment combination attribute
These attribute values define how two or more segments are combined.

  • PUBLIC makes the linker concatenate segments with the same name defined in different modules (i.e. source files).
  • PRIVATE is the opposite of the above: segments with the same name defined outside of the current module will not be concatenated (note that within the same file, segments with the same name are still considered the same segment).
  • STACK is the same as PUBLIC, but in the generated binary, metadata is created so that the OS sets SS:SP to the end of a segment using this attribute (after concatenation has taken place).
  • MEMORY is either an alias for STACK or a typo in the TASM manual.
  • COMMON all the segments with the same name will overlap instead of being concatenated. The final segment is as long as the longest COMMON segment.
  • VIRTUAL is used to declare a segment that must appear only once in the final binary, regardless of how many times it is declared across all the modules.
  • AT places a segment at a specific address.
  • UNINIT marks the segment as containing uninitialized data (just like the .bss section in ELF).
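
As a rough illustration of the combination attributes above, here is a sketch in MASM/TASM syntax. The file names a.asm and b.asm and all segment and variable names are made up for the example, and the two modules are shown in one listing only for brevity (each would be assembled separately and end with its own END).

```asm
; ---- module A (hypothetical file a.asm) ----
MYDATA  SEGMENT PUBLIC 'DATA'
var_a   DW      1234h               ; first piece of the combined MYDATA
MYDATA  ENDS

SCRATCH SEGMENT COMMON 'DATA'
buf_a   DB      16 DUP(?)           ; 16 bytes, overlaid with other COMMON pieces
SCRATCH ENDS

; ---- module B (hypothetical file b.asm) ----
MYDATA  SEGMENT PUBLIC 'DATA'
var_b   DW      5678h               ; linker appends this piece after module A's
MYDATA  ENDS                        ; (subject to the segment alignment)

SCRATCH SEGMENT COMMON 'DATA'
buf_b   DB      32 DUP(?)           ; starts at the same offset as buf_a;
SCRATCH ENDS                        ; the final SCRATCH is 32 bytes long

; ---- AT: a segment that only names existing memory ----
BIOSDAT SEGMENT AT 0040h            ; BIOS data area at paragraph 0040h
        ORG     0017h
kbflags DB      ?                   ; keyboard flags byte at 0040:0017
BIOSDAT ENDS
```

An AT segment contributes no bytes to the binary; it only gives names to addresses that already exist in memory.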

Segment class attribute

This is a quoted string that represents the segment class. The segment class is meaningful to the linker only: it helps with ordering the segments and recognizing the purpose of each one when creating the metadata in the final binary file.
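The classes used by the simplified segment directives are 'CODE', 'FAR_DATA', 'FAR_BSS', 'DATA', 'CONST', 'BSS' and 'STACK', assigned to the segments _TEXT, FAR_DATA, FAR_BSS, _DATA, CONST, _BSS and STACK respectively.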

Segment alignment attribute
These values specify the alignment the segment must have.
In short, they tell the linker at what multiple a segment can begin. For example PARA (paragraph, 16 bytes) tells the linker that a segment can start at an address that is a multiple of 16: 0, 16, 32, 48, ...

  • BYTE, alignment of 1
  • WORD, alignment of 2
  • DWORD, alignment of 4
  • PARA, alignment of 16
  • PAGE, alignment of 256
  • MEMPAGE, alignment of 4096
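
A small sketch of how the alignment value is written in a declaration. The segment and variable names are invented for the example; the attribute order shown (alignment, combination, class) is just the conventional one.

```asm
TABLES  SEGMENT PARA PUBLIC 'DATA'  ; may only start at a multiple of 16 bytes
lookup  DB      256 DUP(0)
TABLES  ENDS

FLAGS   SEGMENT BYTE PUBLIC 'DATA'  ; may start at any address, so no padding
ready   DB      0                   ; bytes are inserted in front of it
FLAGS   ENDS
```

PARA is the natural choice for a segment that gets its own segment register value, because a real-mode segment register can only express addresses that are multiples of 16.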

Segment size attribute
These values specify the size of the code and data of a segment.

  • USE16 tells the assembler that the code to be generated must be 16-bit and that the data accessed must use 16-bit address size.
  • USE32 the same as above but with 32-bit size.
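
A minimal sketch of the size attribute, assuming TASM or MASM (emu8086 emulates an 8086 and is not expected to accept this). The segment name CODE16 is made up for the example.

```asm
        .386                        ; enable 386 instructions before the segment
CODE16  SEGMENT USE16 'CODE'        ; 16-bit code despite the .386 above
        ASSUME  CS:CODE16
start:
        mov     eax, 12345678h      ; 32-bit operand in 16-bit code: the
                                    ; assembler emits a 66h operand-size prefix
        mov     ax, 4C00h           ; DOS: terminate with exit code 0
        int     21h
CODE16  ENDS
        END     start
```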

Segment access attributes
These attributes, not supported by TLINK, tell the linker what access restrictions to put in the metadata for the segment.
This doesn't apply to DOS binaries.
The values are EXECONLY, EXECREAD, READONLY, READWRITE. The names are self-explanatory.


The segment definition SSEG SEGMENT STACK 'STACK' defines a segment with:

  • Name SSEG.
  • Combination attribute STACK, making the linker emit metadata to set SS:SP to point to the end of it.
  • Class attribute 'STACK', making the linker recognize it as a stack segment.

The linker is made aware that a segment is a stack segment through both the STACK combination and the 'STACK' class.
The first controls the initial value of SS:SP [1] and would be enough on its own to have a stack.
The second specifies the ordering and the grouping of the segment itself.

Segment grouping

The linker can group segments together. This is just like concatenating the segments, but respecting their alignment constraints.
By grouping segments together it is possible to use a single segment register to access all of them.
Segment grouping is also used to group segments logically, i.e. to make the linker treat them all the same way.

In particular, when the .MODEL directive is used, TASM implicitly defines the group DGROUP, which includes, among others, the segments with class 'DATA' and 'STACK'.
You can exclude the 'STACK' segments from DGROUP with the FARSTACK option of the .MODEL directive.

So the class 'STACK' tells the linker that the segment must go (or must not go) into DGROUP.
Furthermore inside that group 'STACK' segments are placed after any other segment class.
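
To make the idea of a group concrete, here is a sketch that defines one explicitly with the GROUP directive instead of relying on DGROUP. All names (MYGROUP, DSEG1, DSEG2, CSEG, first, second) are invented for the example, and no stack segment is declared, so a linker would normally warn about that.

```asm
DSEG1   SEGMENT PARA PUBLIC 'DATA'
first   DW      1
DSEG1   ENDS

DSEG2   SEGMENT PARA PUBLIC 'DATA'
second  DW      2
DSEG2   ENDS

MYGROUP GROUP   DSEG1, DSEG2        ; both segments share one base address

CSEG    SEGMENT PARA PUBLIC 'CODE'
        ASSUME  CS:CSEG, DS:MYGROUP
start:
        mov     ax, MYGROUP         ; one segment register value...
        mov     ds, ax
        mov     ax, first           ; ...reaches a variable in DSEG1
        add     ax, second          ; ...and one in DSEG2
        mov     ax, 4C00h           ; DOS: terminate with exit code 0
        int     21h
CSEG    ENDS
        END     start
```

Without the group, DS would have to be reloaded (or a second segment register used) to reach the variable in the other segment.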


The final effect of the pair STACK 'STACK' is to:

  1. Initialize SS:SP.
  2. Place the stack after the data.
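
For context, this is roughly how the declaration from the question sits inside a complete program with explicit segments. It is a sketch in the style of the emu8086 samples, not one of them; DSEG, CSEG and msg are names chosen for the example.

```asm
DSEG    SEGMENT 'DATA'
msg     DB      'Hello$'
DSEG    ENDS

SSEG    SEGMENT STACK 'STACK'
        DW      100h DUP(?)         ; 100h words = 200h bytes reserved
SSEG    ENDS

CSEG    SEGMENT 'CODE'
        ASSUME  CS:CSEG, DS:DSEG, SS:SSEG
start:
        ; Nothing to do for the stack: because of the STACK combination,
        ; the loader has already pointed SS:SP at the end of SSEG.
        mov     ax, DSEG            ; DS, by contrast, is NOT set automatically
        mov     ds, ax
        mov     dx, OFFSET msg
        mov     ah, 09h             ; DOS: print '$'-terminated string
        int     21h
        mov     ax, 4C00h           ; DOS: terminate with exit code 0
        int     21h
CSEG    ENDS
        END     start
```

Note that the program never loads SS or SP itself, whereas DS has to be set by hand.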

The DW 100h DUP(?) line does just what you said. The correct term, however, is "reserve", as I believe no space is allocated in the binary file for the stack. The linker may be smart enough to recognize that a 'STACK' classed segment with uninitialized data doesn't need to take up space on disk.
But I may be wrong, I don't remember exactly if the MZ header of the binary file would permit that.

Another, simpler way to declare a stack segment of a predefined size is .STACK 200h (or simply .STACK if you are OK with using 1 KiB of stack).
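
A sketch of the simplified form, assuming the assembler supports the simplified segment directives (TASM and MASM do; check your emu8086 version). The name msg is made up for the example.

```asm
        .MODEL  SMALL
        .STACK  200h                ; 200h bytes, same size as DW 100h DUP(?)
        .DATA
msg     DB      'Hello$'
        .CODE
start:
        mov     ax, @data           ; DS still has to be loaded by hand
        mov     ds, ax
        mov     dx, OFFSET msg
        mov     ah, 09h             ; DOS: print '$'-terminated string
        int     21h
        mov     ax, 4C00h           ; DOS: terminate with exit code 0
        int     21h
        END     start
```

Here the segment name, the STACK combination and the 'STACK' class are all supplied by the directive, so only the size is left to choose.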


Chapter 7 of the TASM manual has more complete info on this long topic.


[1] To better understand this point, it's worth noting that the generated EXE has a header where the linker can specify the initial values for those registers, and the OS will relocate and set them when loading the binary.