I'm writing a small kernel in assembly and running it in QEMU. It has some bugs, so I want to debug it with gdb. I assembled it like this:
$ nasm -g -f elf -o myos.elf myos.asm
$ objcopy --only-keep-debug myos.elf myos.sym
$ objcopy -O binary myos.elf myos.bin
Then I run it in QEMU with:
$ qemu-system-i386 -s -S myos.bin
Then I connect with gdb:
$ gdb
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x0000fff0 in ?? ()
(gdb) symbol-file myos.sym
Reading symbols from /home/sven/Projekte/myos/myos.sym...done.
I have a label named welcome in my kernel that points to a string. While testing, I tried to look at that string and got the following result:
(gdb) x/32b welcome
0x1e <welcome>: 0x00 0xf0 0xa5 0xfe 0x00 0xf0 0x87 0xe9
0x26: 0x00 0xf0 0x6e 0xc9 0x00 0xf0 0x6e 0xc9
0x2e: 0x00 0xf0 0x6e 0xc9 0x00 0xf0 0x6e 0xc9
0x36: 0x00 0xf0 0x57 0xef 0x00 0xf0 0x6e
The label is defined like this:
welcome: db "System started. Happy hacking!", 10, 0
As you can see, gdb shows welcome starting with a null byte, although by definition it doesn't. The kernel itself uses the label correctly, so it doesn't seem to be a problem with my code. Other parts of memory that I examined don't match the loaded kernel either.
Does anyone know why the virtual machine's memory doesn't match the loaded kernel while the machine still behaves correctly?
Explanation
- qemu-system-i386 loads an x86 boot sector image file so that its first byte is at address 0x7c00 at run time.
- Your ELF files (myos.elf, myos.sym) mistakenly tell GDB that the code is loaded at address 0. GDB therefore thinks welcome is at 0x1e, while it is actually at 0x7c1e.
- Adding 0x7c00 to all addresses in GDB would work but is clumsy: x/32xb (welcome + 0x7c00)
- A better solution is to create an ELF file with the right addresses.
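The address mismatch described above comes down to one addition. Here is a minimal Python sketch of it; the names runtime_address and LOAD_BASE are made up for illustration, and the only values taken from the discussion are the load address 0x7c00 and the link-time address 0x1e that GDB reports for welcome.

```python
# Hypothetical helper (not part of any tool): translate an address from an
# image linked at 0 into the address where the boot sector sits in memory.
LOAD_BASE = 0x7C00  # where the boot sector is placed at run time


def runtime_address(link_address: int, load_base: int = LOAD_BASE) -> int:
    """Return the run-time address for an address from a file linked at 0."""
    return load_base + link_address


welcome_link_address = 0x1E  # what GDB reports for `welcome`
print(hex(runtime_address(welcome_link_address)))  # 0x7c1e
```

Linking the ELF file at 0x7C00, as in the solution, makes this manual translation unnecessary.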
Solution
boot.asm
; 'boot.asm'
; loaded by BIOS
[bits 16]
global main
main:
mov di, welcome
print_welcome:
mov ah, 0x0e
mov al, [di]
int 0x10
inc di
cmp byte [di], 0
jne print_welcome
hlt
db "XXXXXXXXXXXXXX" ; some padding to make welcome appear at 0x1e
welcome: db "System started. Happy hacking!", 10, 0
; x86 boot sector padding and signature
; NOTE: intentionally commented out. Will be added by linker script
;times 510 - ($ - $$) db 0x00
;db 0x55, 0xAA
x86-boot.ld
ENTRY(main);
SECTIONS
{
. = 0x7C00;
.text : AT(0x7C00)
{
_text = .;
*(.text);
_text_end = .;
}
.data :
{
_data = .;
*(.bss);
*(.bss*);
*(.data);
*(.rodata*);
*(COMMON)
_data_end = .;
}
.sig : AT(0x7DFE)
{
SHORT(0xaa55);
}
/DISCARD/ :
{
*(.note*);
*(.iplt*);
*(.igot*);
*(.rel*);
*(.comment);
/* add any unwanted sections spewed out by your version of gcc and flags here */
}
}
Build the code with:
nasm -g -f elf -F dwarf boot.asm -o boot.o
cc -nostdlib -m32 -T x86-boot.ld -Wl,--build-id=none boot.o -o boot
objcopy -O binary boot boot.good.bin
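If you want to sanity-check the flat binary, the linker script implies a simple layout: the image starts at 0x7C00 and the signature sits at 0x7DFE, i.e. at file offset 510. The Python sketch below checks that layout on a synthetic image built in memory. The function name looks_like_boot_sector is hypothetical. I am assuming the output of objcopy is exactly 512 bytes, which is what I would expect from this script but have not verified on every toolchain.

```python
SECTOR_SIZE = 512
LOAD_BASE = 0x7C00    # start address used in the linker script
SIG_ADDRESS = 0x7DFE  # load address of the .sig section


def looks_like_boot_sector(image: bytes) -> bool:
    """Check the size and the 0x55 0xAA signature in the last two bytes."""
    sig_offset = SIG_ADDRESS - LOAD_BASE  # 510
    return (
        len(image) == SECTOR_SIZE
        and image[sig_offset:sig_offset + 2] == b"\x55\xaa"
    )


# Synthetic stand-in for the real file: zeros plus the signature.
# SHORT(0xaa55) is stored little-endian, so 0x55 comes first.
image = bytearray(SECTOR_SIZE)
image[510:512] = (0xAA55).to_bytes(2, "little")
print(looks_like_boot_sector(bytes(image)))
```

To check the real file, you could pass the contents of boot.good.bin (read in binary mode) instead of the synthetic image.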
dump-welcome.gdb
target remote localhost:1234
symbol-file boot
monitor system_reset
# run until hlt instruction, address obtained through disassembly
until *0x7c0f
x/32xb welcome
monitor quit
disconnect
quit
Sample session:
$ qemu-system-x86_64 -s -S boot.good.bin &
$ gdb -q -x dump-welcome.gdb
0x0000fff0 in ?? ()
main () at boot.asm:16
16 hlt
0x7c1e <welcome>: 0x53 0x79 0x73 0x74 0x65 0x6d 0x20 0x73
0x7c26: 0x74 0x61 0x72 0x74 0x65 0x64 0x2e 0x20
0x7c2e: 0x48 0x61 0x70 0x70 0x79 0x20 0x68 0x61
0x7c36: 0x63 0x6b 0x69 0x6e 0x67 0x21 0x0a 0x00
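To confirm that these bytes really are the welcome message, you can decode them back into text. This is a small Python sketch; decode_gdb_dump is a made-up helper that assumes each row has the form "address: byte byte ...", as in the session above.

```python
def decode_gdb_dump(dump: str) -> bytes:
    """Collect the byte values from x/32xb-style rows into a bytes object."""
    data = bytearray()
    for line in dump.strip().splitlines():
        # Everything after the first colon is the list of byte values.
        _, _, values = line.partition(":")
        data.extend(int(token, 16) for token in values.split())
    return bytes(data)


dump = """
0x7c1e <welcome>: 0x53 0x79 0x73 0x74 0x65 0x6d 0x20 0x73
0x7c26: 0x74 0x61 0x72 0x74 0x65 0x64 0x2e 0x20
0x7c2e: 0x48 0x61 0x70 0x70 0x79 0x20 0x68 0x61
0x7c36: 0x63 0x6b 0x69 0x6e 0x67 0x21 0x0a 0x00
"""

print(repr(decode_gdb_dump(dump).decode("ascii")))
# 'System started. Happy hacking!\n\x00'
```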
Thought Process
Most of the bytes you dumped are either 0x00 or ≥ 0x80, i.e. they're not printable ASCII characters. This raises the question: am I really dumping the right address?
The hex dump of your welcome message should be:
$ python -c 's = "System started. Happy hacking!"; print([hex(ord(x)) for x in s])'
['0x53', '0x79', '0x73', '0x74', '0x65', '0x6d', '0x20', '0x73', '0x74', '0x61', '0x72', '0x74', '0x65', '0x64', '0x2e', '0x20', '0x48', '0x61', '0x70', '0x70', '0x79', '0x20', '0x68', '0x61', '0x63', '0x6b', '0x69', '0x6e', '0x67', '0x21']
Using GDB to search for the welcome message in memory would have revealed the right address as well:
(gdb) find 0, 0xffff, 'S', 'y', 's', 't'
0x7c1e
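The same search can be done offline on the flat binary, provided you remember to add the load address to the file offset. A rough Python sketch follows; find_in_image is a hypothetical helper, and the image here is a synthetic 512-byte buffer with the message placed at offset 0x1e rather than the real boot.good.bin.

```python
LOAD_BASE = 0x7C00  # where the boot sector is placed at run time


def find_in_image(image: bytes, needle: bytes, load_base: int = LOAD_BASE):
    """Return the run-time address of the first match, or None if absent."""
    offset = image.find(needle)
    return None if offset < 0 else load_base + offset


# Synthetic stand-in for the boot sector: the message at file offset 0x1e.
message = b"System started. Happy hacking!\n\x00"
image = bytearray(512)
image[0x1E:0x1E + len(message)] = message

print(hex(find_in_image(bytes(image), b"Syst")))  # 0x7c1e
```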
Further Reading
- GNU LD: Basic Linker Script Concepts: see discussion on LMA vs. VMA
- Real mode in C with gcc : writing a bootloader: source of the linker script above. Shows some cool GNU toolchain tricks for x86 real mode development.