Assembly:
[BITS 16]
global _start
_start:
mov ax, 0x07C0
mov ds, ax
mov si, hw
call print_string
jmp $
print_string:
mov ah, 0x0E
.char:
lodsb
cmp al, 0
je .exit
int 0x10
jmp .char
.exit: ret
times 0x100-($-$$) db 0
hw: db "Hello, World!", 0
times 510-($-$$) db 0
dw 0xAA55
Assembling this with:
$ nasm file.asm -felf -o file.o
And then linking it with:
$ ld -melf_i386 -o file.bin file.o --oformat binary
Gives the following error:
file.asm:(.text+0x6): relocation truncated to fit: R_386_16 against `.text'
After fiddling with the code a bit, I figured out that changing mov si, hw
to mov si, 0x100
works fine. But then what's the point of labels?
My guess is that ld can't generate 16-bit binary files, so it replaces hw
with a 32-bit address instead of a 16-bit address. And then it complains because my program tries to put a 32-bit value into a 16-bit register.
Is there some argument I can pass to nasm/ld to make this work?
EDIT:
elf doesn't support 16-bit code; the only output format nasm supports which actually states that it supports 16-bit in nasm -hf
is .obj, but I can't find a linker for it.
NASM Manual:
The ELF32 specification doesn't provide relocations for 8- and 16-bit values, but the GNU ld linker adds these as an extension. NASM can generate GNU-compatible relocations, to allow 16-bit code to be linked as ELF using GNU ld. If NASM is used with the -w+gnu-elf-extensions option, a warning is issued when one of these relocations is generated.
Adding -w+gnu-elf-extensions
does indeed show a warning, but ld still gives the same error.
First of all, I recommend you consider using an i686 ELF cross compiler to avoid some gotchas that can later bite you as you develop your kernel.
Using NASM with the -f bin output option
Nothing prevents you from using ELF as the object file type with NASM, but it is often simpler to use the -f bin
option that generates a fully resolved flat binary file that needs no fixups. It can be used as a boot sector without any linking step. The downside is that all the code has to be in the same file. External assembler source can be included with the %include
directive, similar to C's #include
directive.
For this to work, you have to set the origin point in the assembler file so that NASM knows the base offset (origin point) to use when generating absolute addresses (for labels etc.). You would modify your assembly code and add this at the top:
[ORG 0x0000]
This only applies when using -f bin
output option; the directive will throw an error for other output types like -f elf
. In this case we use 0x0000 because the segment your code assumes is 0x07c0 which is moved into DS. 0x07c0:0x0000 maps to physical address (0x07c0<<4)+0x0000 = 0x07c00 which is where our bootloader will be loaded into memory.
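The segment:offset arithmetic above can be checked with a small sketch. This is only an illustration in Python, not part of the boot code; the function name physical_address is made up for this example, and the 0x100 offset is the position of hw in the question's listing.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode address translation: shift the segment left by 4 bits, add the offset."""
    return (segment << 4) + offset

# DS = 0x07C0 with ORG 0x0000 and DS = 0x0000 with ORG 0x7C00
# both describe the byte where the BIOS loads the boot sector.
print(hex(physical_address(0x07C0, 0x0000)))
print(hex(physical_address(0x0000, 0x7C00)))

# The string hw sits at offset 0x100 within the sector in the question's code.
print(hex(physical_address(0x07C0, 0x0100)))
```

Either pairing is fine as long as the segment register value and the origin agree with each other.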
If you don't specify [org 0x0000]
, then org = 0x0000 is the default when using the -f bin
output option, so it isn't actually necessary to specify it. It just makes it much clearer to a reader by using it explicitly.
In order to assemble this into a binary file you could do:
nasm file.asm -fbin -o file.bin
This would output a flat binary file called file.bin
assembled from file.asm
. No linking step is needed.
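If you want to sanity-check the resulting image, a rough sketch like the following may help. It assumes the layout from the question (padding up to byte 510 followed by dw 0xAA55); the function name is hypothetical, and a synthetic byte string stands in for the contents of file.bin so the example runs on its own.

```python
BOOT_SIGNATURE = b"\x55\xaa"  # dw 0xAA55 is stored little-endian: 0x55 first, then 0xAA

def looks_like_boot_sector(image: bytes) -> bool:
    """Check the two properties the question's code sets up: 512 bytes, signature at 510."""
    return len(image) == 512 and image[510:512] == BOOT_SIGNATURE

# Synthetic stand-in for the assembled file: 510 zero bytes plus the signature.
fake_sector = bytes(510) + BOOT_SIGNATURE
print(looks_like_boot_sector(fake_sector))   # True
print(looks_like_boot_sector(bytes(512)))    # False, the signature is missing
```

For a real check you would pass in the bytes read from file.bin instead of the synthetic sector.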
Using NASM with the -f elf output option
In your example you are using ELF. There may be a couple of reasons for doing it this way. Your generated binary file may be the combination of multiple object (.o
) files, or you may wish to generate debug symbols to be used with a debugger like GDB. Whatever your reason, this can be done using these commands:
nasm file.asm -felf -o file.o
ld -melf_i386 -Ttext 0x0 -o file.bin file.o --oformat binary
-Ttext 0x0
would be the origin point that matches your code. 0x0000 in this case is the same value you would have used with the ORG
directive had you used NASM with the -f bin
output option. If you had written your code to assume an offset of 0x7c00 with code like:
xor ax, ax ; AX = 0
mov ds, ax ; DS = 0
Then the TEXT segment would have to be specified with:
ld -melf_i386 -Ttext 0x7c00 -o file.bin file.o --oformat binary
Your question may be: why do we need to explicitly set a value for the base of the TEXT segment? The reason is that the default for LD depends on the OS you are targeting (usually the platform you are currently running on). If you are on Linux, by default LD will attempt to create output for Linux. On Linux the default for the start of the TEXT segment is usually 0x08048000
when specifying -m elf_i386
. This is of course a 32-bit value.
Any place an absolute address was needed it would attempt to add 0x08048000
(or potentially some other large address) to it. So an instruction like this:
mov si, hw
Would attempt to move the address of hw
into the 16-bit register SI. The linker would have attempted to resolve this to 0x08048000 + offset of hw
when creating the flat binary output file. Because you have a 32-bit value being used in an instruction that only takes a 16-bit value, LD reports the "relocation truncated to fit" error shown in your question. Even if the 32-bit value were truncated to 16 bits, the result would likely be an incorrect 16-bit address.
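To make the overflow concrete, here is a simplified model of what the linker is being asked to do. It is a sketch in Python, not LD's actual overflow logic; the helper names are invented for this example, and 0x08048000 is just the typical Linux default mentioned above (the real base may differ).

```python
HW_OFFSET = 0x100  # offset of hw within the section in the question's code

def resolved_address(text_base: int, offset: int) -> int:
    """Absolute address the linker would compute for a label."""
    return text_base + offset

def fits_in_16_bits(address: int) -> bool:
    """Simplified range check for a 16-bit relocation field."""
    return 0 <= address <= 0xFFFF

for base in (0x08048000, 0x0000, 0x7C00):
    addr = resolved_address(base, HW_OFFSET)
    print(f"base={base:#x} hw={addr:#x} fits={fits_in_16_bits(addr)} low16={addr & 0xFFFF:#x}")
```

With the Linux default base the address needs more than 16 bits, and its low 16 bits do not match the 0x100 the boot code actually needs. With -Ttext 0x0 or -Ttext 0x7c00 the address fits as is.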
I found a solution here: Looking for 16-bit x86 compiler
something I learned the hard way;
-Ttext 0x0
is critical, otherwise the .text segment is pushed outside of the 16-bit addressing range (don't ask me why)