Creating a simple multiboot kernel loaded with GRUB

Posted 2019-01-24 12:26

Question:

I'm trying to follow the instructions here to build a simple OS kernel: http://mikeos.sourceforge.net/write-your-own-os.html

Except, instead of booting from a floppy, I want to create a grub-based ISO image and boot a multiboot CD in the emulator. I've added the following to the source listed at that page, for the multiboot header:

MBALIGN     equ  1<<0                   ; align loaded modules on page boundaries
MEMINFO     equ  1<<1                   ; provide memory map
FLAGS       equ  MBALIGN | MEMINFO      ; this is the Multiboot 'flag' field
MAGIC       equ  0x1BADB002             ; 'magic number' lets bootloader find the header
CHECKSUM    equ -(MAGIC + FLAGS)        ; checksum of above, to prove we are multiboot
section .multiboot
align 4
    dd MAGIC
    dd FLAGS
    dd CHECKSUM

and I'm doing the following to create the image:

nasm -felf32 -o init.bin  init.s
cp init.bin target/boot/init.bin
grub2-mkrescue -o init.iso target/

Then I run qemu to boot it:

qemu-system-x86_64 -cdrom ./init.iso 

After selecting 'myos' from the boot menu, I get the error

error: invalid arch-dependent ELF magic

What does that mean, and how can I fix it? I tried messing with the elf format, but only -felf32 seems to work...

Answer 1:

GRUB supports ELF32 and flat binaries. Your header, though, leaves bit 16 of the flags field clear, which implicitly says that you are providing an ELF binary.

Using Flat Binary with Multiboot

If you wish to tell the Multiboot loader (GRUB) that you are using a flat binary, you must set bit 16 of the header's flags field to 1:

MULTIBOOT_AOUT_KLUDGE    equ  1 << 16
                              ;FLAGS[16] indicates to GRUB we are not
                              ;an ELF executable and the fields
                              ;header address, load address, load end address,
                              ;bss end address, and entry address will be
                              ;available in our Multiboot header

It isn't as simple as just specifying this flag. You must provide a complete Multiboot header that gives the Multiboot loader the information it needs to load your binary into memory. With the ELF format this information is in the ELF header that precedes the code, so it doesn't have to be provided explicitly. The Multiboot header is defined in great detail in the GRUB documentation.
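As a rough illustration of what a complete header looks like, the following Python sketch (not part of the original answer) packs the eight 32-bit little-endian fields named above and computes the checksum. The function name build_header and the example addresses (a file loaded at 0x100000, with the header 8 bytes in and the entry point right after the 32-byte header) are assumptions chosen for illustration only.

```python
import struct

MULTIBOOT_HEADER_MAGIC = 0x1BADB002
MULTIBOOT_ALIGN = 1 << 0
MULTIBOOT_MEMINFO = 1 << 1
MULTIBOOT_AOUT_KLUDGE = 1 << 16


def build_header(flags, header_addr, load_addr, load_end_addr,
                 bss_end_addr, entry_addr):
    """Pack a Multiboot header that uses the address fields (hypothetical helper)."""
    # magic + flags + checksum must add up to 0 modulo 2**32.
    checksum = -(MULTIBOOT_HEADER_MAGIC + flags) & 0xFFFFFFFF
    return struct.pack("<8I", MULTIBOOT_HEADER_MAGIC, flags, checksum,
                       header_addr, load_addr, load_end_addr,
                       bss_end_addr, entry_addr)


# Assumed layout, for illustration: file loaded at 0x100000, header at offset 8.
flags = MULTIBOOT_AOUT_KLUDGE | MULTIBOOT_ALIGN | MULTIBOOT_MEMINFO
header = build_header(flags,
                      header_addr=0x100008,
                      load_addr=0x100000,
                      load_end_addr=0,    # 0: treat the whole file as text+data
                      bss_end_addr=0,     # 0: no bss segment
                      entry_addr=0x100028)
print(len(header), header.hex())
```

In an assembler source the same values come from labels rather than hard-coded numbers, which is why the origin point of the code matters.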

When using NASM with -f bin, note that we need to specify the origin point for our code. In this setup the Multiboot loader loads our kernel at physical address 0x100000 (1 MiB). We must specify in our assembler file that our origin point is 0x100000 so that the correct absolute addresses are generated in the final flat binary image.

This is an example stripped and modified from one of my own projects that provides a simple header. The call to _Main is set up like a C call in the example, but you don't have to do it that way. Usually I call into a function that takes a couple parameters on the stack (using C calling convention).

[BITS 32]
[global _start]
[ORG 0x100000]                ;If using '-f bin' we need to specify the
                              ;origin point for our code with ORG directive
                              ;multiboot loaders load us at physical 
                              ;address 0x100000

MULTIBOOT_AOUT_KLUDGE    equ  1 << 16
                              ;FLAGS[16] indicates to GRUB we are not
                              ;an ELF executable and the fields
                              ;header address, load address, load end address,
                              ;bss end address and entry address will be available
                              ;in Multiboot header
MULTIBOOT_ALIGN          equ  1<<0   ; align loaded modules on page boundaries
MULTIBOOT_MEMINFO        equ  1<<1   ; provide memory map

MULTIBOOT_HEADER_MAGIC   equ  0x1BADB002
                              ;magic number GRUB searches for in the first 8k
                              ;of the kernel file GRUB is told to load

MULTIBOOT_HEADER_FLAGS   equ  MULTIBOOT_AOUT_KLUDGE|MULTIBOOT_ALIGN|MULTIBOOT_MEMINFO
CHECKSUM                 equ  -(MULTIBOOT_HEADER_MAGIC + MULTIBOOT_HEADER_FLAGS)

KERNEL_STACK             equ  0x00200000  ; Stack starts at the 2 MiB address & grows down

_start:
        xor    eax, eax                ;Clear eax and ebx in the event
        xor    ebx, ebx                ;we are not loaded by GRUB.
        jmp    multiboot_entry         ;Jump over the multiboot header
        align  4                       ;Multiboot header must be 32
                                       ;bits aligned to avoid error 13
multiboot_header:
        dd   MULTIBOOT_HEADER_MAGIC    ;magic number
        dd   MULTIBOOT_HEADER_FLAGS    ;flags
        dd   CHECKSUM                  ;checksum
        dd   multiboot_header          ;header address
        dd   _start                    ;load address: start of the text
                                       ;segment, in our case _start
        dd   00                        ;load end address : not necessary
        dd   00                        ;bss end address : not necessary
        dd   multiboot_entry           ;entry address GRUB will start at

multiboot_entry:
        mov    esp, KERNEL_STACK       ;Setup the stack
        push   0                       ;Reset EFLAGS
        popf

        push   eax                     ;2nd argument is magic number
        push   ebx                     ;1st argument multiboot info pointer
        call   _Main                   ;Call _Main 
        add    esp, 8                  ;Cleanup 8 bytes pushed as arguments

        cli
endloop:
        hlt
        jmp   endloop

_Main:  
        ret                            ; Do nothing

The Multiboot loader (GRUB) searches the first 8 KiB of your file (whether ELF or flat binary) for the Multiboot header, which must sit on a 32-bit boundary. If bit 16 of the Multiboot header flags is clear, it assumes you are providing an ELF image and parses the ELF header to retrieve the information it needs to load your kernel file into memory. If bit 16 is set, a complete Multiboot header is required so that the loader has the information to read your kernel into memory, perform initialization, and then call into your kernel.
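The search described above can be sketched in Python as follows. This is an illustration, not GRUB's actual code: the function names are hypothetical, and the sketch validates only the first three fields (magic, flags, checksum).

```python
import struct

MULTIBOOT_HEADER_MAGIC = 0x1BADB002
MULTIBOOT_SEARCH = 8192          # header must lie within the first 8 KiB
MULTIBOOT_AOUT_KLUDGE = 1 << 16


def find_multiboot_header(image: bytes):
    """Return (offset, flags) of the first valid header, or None."""
    limit = min(len(image), MULTIBOOT_SEARCH)
    # Step in 4-byte units because the header is 32-bit aligned.
    for offset in range(0, limit - 11, 4):
        magic, flags, checksum = struct.unpack_from("<3I", image, offset)
        if (magic == MULTIBOOT_HEADER_MAGIC
                and (magic + flags + checksum) & 0xFFFFFFFF == 0):
            return offset, flags
    return None


def uses_address_fields(flags: int) -> bool:
    """True if bit 16 is set, i.e. the loader should not rely on ELF headers."""
    return bool(flags & MULTIBOOT_AOUT_KLUDGE)
```

Calling find_multiboot_header on the contents of init.bin before building the ISO is a quick way to check that the header is found at all and that bit 16 matches the kind of file you built.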

You would then assemble your init.s to a flat binary with something like:

nasm -f bin -o init.bin init.s

Using ELF with Multiboot

To tie in Jester's comments to your original question, you should have been able to boot with ELF and have it work, but it didn't because of one small detail. In your example you used this to make init.bin:

nasm -f elf32 -o init.bin  init.s

When using -f elf32, NASM generates object files (they aren't executable) that must be linked (with LD, for example) to produce a final ELF32 executable. It probably would have worked if you had assembled and linked with something like:

nasm -f elf32 init.s -o init.o 
ld -Ttext=0x100000 -melf_i386 -o init.bin init.o
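A quick way to tell the two kinds of file apart is the e_type field of the ELF header. The Python sketch below is only a diagnostic aid with a hypothetical function name (elf_kind); it is not the check GRUB itself performs. An unlinked NASM object should report as relocatable, and the output of LD as executable.

```python
import struct

ET_NAMES = {
    1: "relocatable object (ET_REL)",
    2: "executable (ET_EXEC)",
    3: "shared object (ET_DYN)",
}


def elf_kind(data: bytes) -> str:
    """Describe an ELF file by its e_type field."""
    if len(data) < 18 or data[:4] != b"\x7fELF":
        return "not an ELF file"
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian.
    byte_order = "<" if data[5] == 1 else ">"
    # e_type is the 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(byte_order + "H", data, 16)
    return ET_NAMES.get(e_type, f"other (e_type={e_type})")
```

For example, elf_kind(open("init.o", "rb").read()) inspects the object file produced by the first command.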

Please note that when using -f elf32 you must remove the ORG directive from init.s. The ORG directive only applies when using -f bin. Multiboot loaders will load us at physical address 0x100000, so we must make sure that the assembled and linked code is generated with that origin point. When using -f elf32 we specify the origin point with -Ttext=0x100000 on the linker (LD) command line. Alternatively, the origin point can be set in a linker script.
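To confirm that a linked file really targets 0x100000, the entry point and the loadable segments can be read from the ELF32 headers. This is a minimal sketch, assuming a little-endian ELF32 file; the function name elf32_load_info is hypothetical, and the exact segment layout LD produces may vary between versions.

```python
import struct

PT_LOAD = 1


def elf32_load_info(data: bytes):
    """Return (e_entry, [(p_paddr, p_filesz, p_memsz), ...]) for PT_LOAD segments."""
    if data[:4] != b"\x7fELF" or data[4] != 1 or data[5] != 1:
        raise ValueError("not a little-endian ELF32 file")
    # ELF32 header: e_entry at offset 24, e_phoff at offset 28.
    e_entry, e_phoff = struct.unpack_from("<II", data, 24)
    # e_phentsize at offset 42, e_phnum at offset 44.
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 42)
    segments = []
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type, _p_offset, _p_vaddr,
         p_paddr, p_filesz, p_memsz) = struct.unpack_from("<6I", data, base)
        if p_type == PT_LOAD:
            segments.append((p_paddr, p_filesz, p_memsz))
    return e_entry, segments
```

Applied to the linked init.bin, the addresses returned should reflect the value passed with -Ttext.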

Using NASM/LD/OBJCOPY to Generate Flat Binary Images

It is possible to use NASM/LD/OBJCOPY together to produce a final flat binary image rather than using -f bin with NASM. If you remove the ORG directive from init.s and use these commands it should generate a flat binary init.bin:

nasm -f elf32 init.s -o init.o
ld -Ttext=0x100000 -melf_i386 -o init.elf init.o
objcopy -O binary init.elf init.bin 

Here NASM is told to generate an ELF32 object: we assemble init.s into an ELF object file called init.o. We then use the linker (LD) to generate an ELF executable called init.elf from init.o. Finally, objcopy strips all the ELF headers off and generates a flat binary executable called init.bin.

This is a lot more involved than just using NASM with the -f bin option to generate the flat executable init.bin. Why bother then? With the method above you can tell NASM to generate debug information that can be used by gdb (the GNU debugger). If you attempt to use -g (enable debugging) with NASM using -f bin, no debugging information gets generated. You can generate debug information by altering the assembly sequence this way:

nasm -g3 -F dwarf -f elf32 init.s -o init.o
ld -Ttext=0x100000 -melf_i386 -o init.elf init.o
objcopy -O binary init.elf init.bin

init.o will contain debug information (in DWARF format) that LD links into init.elf (which retains the debug information). Flat binaries don't contain debug information because it is stripped off when you use objcopy with -O binary. You can use init.elf if you enable the remote debugging facility in QEMU and use GDB for debugging. The debug info in init.elf allows the debugger to single step through your code, access variables and labels by name, show the assembler source, and so on.

Besides generating debug information, there is another reason to use the NASM/LD/OBJCOPY process to generate a kernel binary: LD is much more configurable. LD lets you write linker scripts that give finer control over how things are laid out in the final binary. This can be useful for more complex kernels that contain a mixture of code from different environments (C, assembler, etc.). For a small toy kernel it may not be needed, but as a kernel grows in complexity the benefits of using a linker script become more evident.

Remote debugging of QEMU with GDB

If you use the method in the previous section to generate debugging information inside an ELF executable (init.elf) you can launch QEMU and have it:

  • Load the QEMU environment and halt the CPU at startup. From man page:

    -S Do not start CPU at startup (you must type 'c' in the monitor).

  • Make QEMU listen for a GDB remote connection on localhost:1234 . From man page:

    -s Shorthand for -gdb tcp::1234, i.e. open a gdbserver on TCP port 1234.

Then you just have to launch GDB so that it:

  • Loads our ELF executable (init.elf) with debug symbols and information
  • Connects to localhost:1234 where QEMU is listening
  • Sets up the debug layout of your choice
  • Sets a breakpoint to stop in our kernel (in this example multiboot_entry)

Here is an example of launching our kernel from the CD-ROM image init.iso, and launching GDB to connect to it:

qemu-system-x86_64 -cdrom ./init.iso -S -s &    
gdb init.elf \
        -ex 'target remote localhost:1234' \
        -ex 'layout src' \
        -ex 'layout regs' \
        -ex 'break multiboot_entry' \
        -ex 'continue'

You should be able to use GDB in much the same way as when debugging a normal program. This assumes you will not be debugging a 16-bit program (kernel).

Important Considerations

As Jester points out, when using Multiboot-compliant loaders like GRUB, the CPU is in 32-bit protected mode (not 16-bit real mode) when your kernel starts. Unlike booting right from the BIOS, you won't be able to use 16-bit code, including most of the PC-BIOS interrupts. If you need to be in real mode, you would have to switch back to real mode manually or create a VM86 task (the latter isn't trivial).

This is an important consideration since some of the code you linked to in MikeOS is 16-bit.