Assembly compiled executable on Bash on Ubuntu on

2019-01-09 16:55发布

问题:

I've been looking at a tutorial for assembly, and I'm trying to get a hello world program to run. I am using Bash on Ubuntu on Windows.

Here is the assembly:

section .text
    global _start     ;must be declared for linker (ld)

_start:             ;tells linker entry point
    mov edx,len     ;message length
    mov ecx,msg     ;message to write
    mov ebx,1       ;file descriptor (stdout)
    mov eax,4       ;system call number (sys_write)
    int 0x80        ;call kernel

    mov eax,1       ;system call number (sys_exit)
    int 0x80        ;call kernel

section .data
    msg db 'Hello, world!', 0xa  ;string to be printed
    len equ $ - msg     ;length of the string

I am using these commands to create the executable:

nasm -f elf64 hello.asm -o hello.o
ld -o hello hello.o -m elf_x86_64

And I run it using:

./hello

The program then seems to run without a segmentation fault or error, but it produces no output.

I can't figure out why the code won't produce an output, but I wonder if using Bash on Ubuntu on Windows has anything to do with it? Why doesn't it produce output and how can I fix it?

回答1:

The issue is with Ubuntu for Windows (Windows Subsystem for Linux). It only supports the 64-bit syscall interface and not the 32-bit x86 int 0x80 system call mechanism.

Besides not being able to use int 0x80 (32-bit compatibility) in 64-bit binaries, Ubuntu on Windows (WSL) doesn't support running 32-bit executables either.


You need to convert from using int 0x80 to syscall. It's not difficult. A different set of registers are used for a syscall and the system call numbers are different from their 32-bit counterparts. Ryan Chapman's blog has information on the syscall interface, the system calls, and their parameters. Sys_write and Sys_exit are defined this way:

%rax  System call  %rdi               %rsi              %rdx          %r10 %r8 %r9
----------------------------------------------------------------------------------
0     sys_read     unsigned int fd    char *buf         size_t count          
1     sys_write    unsigned int fd    const char *buf   size_t count
60    sys_exit     int error_code     

Using syscall also clobbers RCX and the R11 registers. They are considered volatile. Don't rely on them being the same value after the syscall.

Your code could be modified to be:

section .text
    global _start     ;must be declared for linker (ld)

_start:             ;tells linker entry point
    mov edx,len     ;message length
    mov rsi,msg     ;message to write
    mov edi,1       ;file descriptor (stdout)
    mov eax,edi     ;system call number (sys_write)
    syscall         ;call kernel

    xor edi, edi    ;Return value = 0
    mov eax,60      ;system call number (sys_exit)
    syscall         ;call kernel

section .data
    msg db 'Hello, world!', 0xa  ;string to be printed
    len equ $ - msg     ;length of the string

Note: in 64-bit code if the destination register of an instruction is 32-bit (like EAX, EBX, EDI, ESI etc) the processor zero extends the result into the upper 32-bits of the 64-bit register. mov edi,1 has the same effect as mov rdi,1.


This answer isn't a primer on writing 64-bit code, only about using the syscall interface. If you are interested in the nuances of writing code that calls the C library, and conforms to the 64-bit System V ABI there are reasonable tutorials to get you started like Ray Toal's NASM tutorial. He discusses stack alignment, the red zone, register usage, and a basic overview of the 64-bit System V calling convention.



回答2:

As already pointed out in comments by Ross Ridge, don't use 32-bit calling of kernel functions when you compile 64bit.

Either compile for 32bit or "translate" the code into 64 bit syscalls. Here is what that could look like:

section .text
    global _start     ;must be declared for linker (ld)

_start:             ;tells linker entry point
    mov rdx,len     ;message length
    mov rsi,msg     ;message to write
    mov rdi,1       ;file descriptor (stdout)
    mov rax,1       ;system call number (sys_write)
    syscall         ;call kernel

    mov rax,60      ;system call number (sys_exit)
    mov rdi,0       ;add this to output error code 0(to indicate program terminated without errors)
    syscall         ;call kernel

section .data
    msg db 'Hello, world!', 0xa  ;string to be printed
    len equ $ - msg     ;length of the string