BIOS int 10h printing garbage on QEMU

2019-01-15 17:10发布

问题:

I have a problem while writing an x86 real mode assembly program that runs as a bootloader in QEMU. I'm trying to print text through BIOS interrupt 0x10. My code is:

print:
    pusha
.loop:
    mov AL, [SI]
    cmp AL, 0
    je .end
    call printChar
    inc SI
    jmp .loop
.end:
    popa
    ret

printChar:
    pusha
    mov AH, 0x0E
    mov BH, 0
    mov BL, 0x0F
    int 0x10
    popa
    ret

I'm using [ORG 0x7c00] as an origin point. I tested the printChar label and calling it with some letter in AL and it works fine. When I try to load a memory address to a message like this:

loadMsg      db "Loading",0
mov SI, loadMessage
call print

I get garbage like 'U' as output on QEMU emulator. Yesterday, I wrote a code really similar to this one and have no problem at all. What is causing my problem and how can it be fixed?

回答1:

I recently wrote some General Bootloader Tips in a Stackoverflow answer that may be of some use to you. Likely Tip #1 applies to your problem:

When the BIOS jumps to your code you can't rely on CS,DS,ES,SS,SP registers having valid or expected values. They should be set up appropriately when your bootloader starts. You can only be guaranteed that your bootloader will be loaded and run from physical address 0x00007c00 and that the boot drive number is loaded into the DL register.

Based on the fact that printChar works, and that writing an entire string out doesn't suggests that DS:SI is not pointing at the proper location in memory where your string resides. The usual cause of this is that developers incorrectly assume the CS and/or DS register is set appropriately when the BIOS jumps to the bootloader. It has to be manually set. In the case of an origin point of 0x7c00, DS needs to be set to 0. In 16-bit real mode physical memory addresses are computed from segment:offset pairs using the formula (segment<<4)+offset . In your case you are using an offset of 0x7C00. A value in DS of 0 would yield the correct physical address of (0<<4)+0x7c00 = 0x07c00 .

You can set DS to 0 at the start of your program with something like:

xor ax, ax       ; set AX to zero
mov ds, ax       ;     DS = 0  

In the case of QEMU, the BIOS jumps to 0x07c0:0x0000 . This also represents the same physical memory location (0x07c0<<4)+0 = 0x07c00 . Such a jump will set CS=0x07c0 (not CS=0). Since there are many segment:offset pairs that map to the same physical memory location, you need to set DS appropriately. You can't rely on CS being the value you expect. So in QEMU, code like this wouldn't even set DS correctly (when using ORG 0x7c00):

mov ax, cs
mov ds, ax       ; Make DS=CS

This may work on some emulators like DOSBOX and some physical hardware, but not all. The environments where this code would work is when the BIOS jumps to 0x0000:0x7c00 . In that case CS would be zero when it reaches your bootloader code, and copying CS to DS would work. Don't assume CS will be zero in all environments is the main point I am making. Always set the segment registers to what you want explicitly.

An example of code that should work is:

    BITS  16
    ORG   0x7c00
    GLOBAL main

main:
    xor ax, ax        ; AX = 0
    mov ds, ax        ; DS = 0
    mov bx, 0x7c00

    cli               ; Turn off interrupts for SS:SP update
                      ; to avoid a problem with buggy 8088 CPUs
    mov ss, ax        ; SS = 0x0000
    mov sp, bx        ; SP = 0x7c00
                      ; We'll set the stack starting just below
                      ; where the bootloader is at 0x0:0x7c00. The
                      ; stack can be placed anywhere in usable and
                      ; unused RAM.
    sti               ; Turn interrupts back on

    mov SI, loadMsg
    call print

    cli
.endloop:
    hlt
    jmp .endloop      ; When finished effectively put our bootloader
                      ; in endless cycle

print:
    pusha
.loop:
    mov AL, [SI]      ; No segment on [SI] means implicit DS:[SI]
    cmp AL, 0
    je .end
    call printChar
    inc SI
    jmp .loop
.end:
    popa
    ret

printChar:
    pusha
    mov AH, 0x0E
    mov BH, 0
    mov BL, 0x0F
    int 0x10
    popa
    ret

; Place the Data after our code
loadMsg db "Loading",0

times 510 - ($ - $$) db 0   ; padding with 0 at the end
dw 0xAA55                   ; PC boot signature


回答2:

This is a problem:

loadMsg    db "Loading",0
mov        SI, loadMessage
call    print

Unless the program jumps over the "loading" text, it executes whatever instructions which those bytes may (or may not) represent. Something like this fixes it:

    jmp      print_msg
    loadMsg    db "Loading",0
print_msg:
    mov        SI, loadMessage
    call    print