I'm trying to understand what'd be the C equivalent of some nasm idioms like these ones:
%define CONSTANT1 1
%define CONSTANT2 2
1) section name_section data align=N
v1: dd 1.2345678
v2: dd 0x12345678
v3: dd 32767
v4:
v5: dd 1.0
v6:
dd 1.0, 2.0, 3.0, 4.0,
dd 5.0, 6.0, 7.0, 8.0
2) section name_section bss align=N
v7:
resd 1
3) global _function_name@0
section name_section code align=N
_function_name@0:
...
4) global _g_structure1
global _g_structure2
section name_section data align=N
_g_structure1:
dw 01h
dw 2
_g_structure2:
dd CONSTANT1
dd CONSTANT2
5) section section_name code align=N
function_name:
...
The nasm documentation here and here didn't clarify too much. Guess my questions are:
- How are dd and similar directives interpreted?
- It seems you can declare N sections of type {code, bss, data} with X bytes alignment, what's the meaning of that in C?
- There are functions with the @N suffix, what's the meaning of that?
- global... you declare global labels? in what scope? nasm files?
- v4: is empty, what does that mean?
dd stores a sequence of DWORDS given by the arguments. So dd 1 will store the 4-byte value 0x00000001 at the current location (since it's targeting a little endian architecture, you'll end up with the bytes 0x01 0x00 0x00 0x00).
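As a rough illustration of the byte layout described above, here is a minimal C sketch. The helper names emit_dd and float_bits are made up for this example, and the float part assumes a 32-bit IEEE 754 float, which is what mainstream x86 toolchains use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write one 32-bit value in little-endian byte order,
   mimicking the bytes that "dd value" emits for an x86 target. */
void emit_dd(uint8_t *out, uint32_t value)
{
    assert(out != NULL);
    out[0] = (uint8_t)(value & 0xFF);          /* least significant byte first */
    out[1] = (uint8_t)((value >> 8) & 0xFF);
    out[2] = (uint8_t)((value >> 16) & 0xFF);
    out[3] = (uint8_t)((value >> 24) & 0xFF);  /* most significant byte last */
}

/* Hypothetical helper: a float operand such as "dd 1.0" is stored as its
   single-precision bit pattern. Assumes float is 32-bit IEEE 754. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* well-defined way to reinterpret bytes */
    return bits;
}
```

So dd 0x12345678 ends up as the bytes 0x78 0x56 0x34 0x12, and dd 1.0 stores the same four bytes as the integer 0x3F800000.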
Sections aren't generally exposed directly in C - it's more of a lower level concern handled by compilers, linkers and runtime loaders. So in general your toolchain will handle the proper allocation of your code and data into sections. For example, the compiler will put the actual assembled code into .text sections, and will put statically initialized data into .data sections, and finally will put uninitialized or zero-initialized statically allocated data into .bss sections, and so on. The details aren't really part of C itself and will vary by platform and executable format (for example, not all platforms have the same types of sections).
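To make that concrete, here is a sketch of roughly what the data definitions from the question could look like in C. The section named in each comment is only where a typical toolchain tends to place the object, not something C guarantees, and sum_v6 is a made-up function added just to have some code.

```c
#include <assert.h>
#include <stdint.h>

/* dd reserves 4 bytes per operand, so check the C types match that size. */
static_assert(sizeof(float) == 4, "float is expected to be 4 bytes here");

/* Initialized data: typically placed in .data. */
float    v1 = 1.2345678f;    /* v1: dd 1.2345678  */
uint32_t v2 = 0x12345678u;   /* v2: dd 0x12345678 */
uint32_t v3 = 32767u;        /* v3: dd 32767      */
float    v6[8] = {1.0f, 2.0f, 3.0f, 4.0f,
                  5.0f, 6.0f, 7.0f, 8.0f};

/* No initializer: typically placed in .bss and zeroed before main runs. */
uint32_t v7;                 /* v7: resd 1 */

/* Code: typically placed in .text. Hypothetical function for illustration. */
float sum_v6(void)
{
    float total = 0.0f;
    for (int i = 0; i < 8; i++)
        total += v6[i];
    return total;
}
```

Note that nothing in the C source names a section; the compiler and linker pick one from the kind of object.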
When using assembly, on the other hand, you need to be a bit more aware of sections. For example, if you have mutable data it is important that it ends up in a different section than your code, since you don't want to run into read-only .text sections, or self-modifying-code false positives, etc.
The section alignment is a directive to the runtime loader that tells it the minimum required alignment for the section. You can impact this in your C code using some compiler or platform specific options - e.g. if you request a statically allocated array to have an alignment of 32, then the .data section may be promoted to at least 32-byte alignment. Before C11, C had no standard way to request such alignment, so you had to use platform specific extensions such as gcc's aligned attribute (posix_memalign covers heap allocations only, and #pragma pack controls how struct members are packed rather than raising alignment). C11 added _Alignas (spelled alignas via <stdalign.h>), and C++11 has alignas to do this in a standard way.
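As a sketch of requesting alignment from C, the snippet below uses the C11 _Alignas specifier, so it assumes a C11 or later compiler. The names aligned_v6 and is_aligned are invented for the example.

```c
#include <stdint.h>

/* C11: ask for 32-byte alignment on a statically allocated array.
   The containing section then needs at least that alignment as well. */
_Alignas(32) float aligned_v6[8] = {1.0f, 2.0f, 3.0f, 4.0f,
                                    5.0f, 6.0f, 7.0f, 8.0f};

/* Hypothetical helper: check whether a pointer is a multiple of alignment. */
int is_aligned(const void *p, uintptr_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

With gcc or clang, __attribute__((aligned(32))) on the declaration is the older, compiler-specific way to get the same effect.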
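The @N suffix is a result of stdcall name mangling: on 32-bit Windows, a stdcall function is decorated as _name@N, where N is the number of bytes of arguments it takes. So _function_name@0 is a stdcall function that takes no arguments.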
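On the C side, the decoration comes from the calling convention on the declaration. The sketch below is only meaningful on 32-bit Windows; the STDCALL macro and the add_pair function are made up here, and on other targets the macro expands to nothing so the code still compiles.

```c
/* Use __stdcall only where it exists and matters (32-bit Windows). */
#if defined(_WIN32) && (defined(_M_IX86) || defined(__i386__))
#define STDCALL __stdcall
#else
#define STDCALL /* no effect on other targets */
#endif

/* No arguments, 0 bytes: exported as _function_name@0 on 32-bit Windows. */
int STDCALL function_name(void)
{
    return 0;
}

/* Hypothetical second function: two 4-byte ints, so the symbol
   would be _add_pair@8 on 32-bit Windows. */
int STDCALL add_pair(int a, int b)
{
    return a + b;
}
```

An assembly routine labelled _function_name@0 can then be called from C by declaring it as int STDCALL function_name(void); instead of defining it.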
You can declare global labels with the help of the GLOBAL directive in nasm. As Peter points out, this only modifies the attributes of a subsequently declared label, and doesn't actually declare the label itself (which is still done in the usual way). This directive has other format-specific options which let you, for example, declare your exported symbol as a function.
The NASM global label directive does not actually declare label. It just modifies what scope it will have when you do declare it, with label:.
It's the opposite of C, where global is the default and you have to use static to get non-exported symbols that are private to this compilation unit.
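A small C sketch of that contrast follows. The struct layout pair16 is a guess at what the two dw values in the question represent, and private_counter, bump and next_id are made-up names. On targets that prefix C symbols with an underscore, such as 32-bit Windows, the C name g_structure1 corresponds to the assembly label _g_structure1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout for:  _g_structure1: dw 01h / dw 2 */
struct pair16 {
    uint16_t first;
    uint16_t second;
};
static_assert(sizeof(struct pair16) == 4, "two dw values take 4 bytes");

/* External linkage is the default in C: comparable to writing
   "global _g_structure1" and then defining the label. */
struct pair16 g_structure1 = {0x01, 2};

/* Internal linkage: comparable to a NASM label with no global directive.
   Other object files cannot see these symbols. */
static int private_counter = 0;

static int bump(void)
{
    return ++private_counter;
}

/* Exported function that uses the private pieces. */
int next_id(void)
{
    return bump();
}
```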
v4: is empty, what does that mean?
Think of labels as zero-width pointers. The label itself has no size, it just labels that position in the binary. (And you can have multiple labels at the same location).
NASM has no types, so it's really quite similar to void*.
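A loose C analogy, under the assumption that a label behaves like an untyped address: two pointers can name the same position, just as v4 and v5 do in the question. The array data and the names label_v4, label_v5 and same_location are invented for this sketch.

```c
/* Hypothetical stand-in for the bytes that follow v3 in the listing. */
float data[] = {1.0f, 1.0f, 2.0f};

/* Two "labels" for one position: v4 has nothing after it, so it refers
   to the same address as v5. Neither pointer occupies space in data. */
const void  *label_v4 = &data[0];  /* untyped, like a bare label */
const float *label_v5 = &data[0];  /* same address, viewed as a float */

/* Hypothetical helper: compare two addresses for equality. */
int same_location(const void *a, const void *b)
{
    return a == b;
}
```

Unlike a NASM label, a C pointer variable is itself an object with storage, so the analogy only covers what the label refers to.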