Why doesn't NASM have trouble with valid instr

Posted 2019-03-05 11:07

Question:

I wrote the following simple program, but nasm refuses to compile it.

section .text
    global _start

_start:
    mov rax, 0x01 
    mov rdi, 0x01
    mov rsi, str
    mov rdx, 0x03
    syscall

    mov rax, 60
    syscall

segment .data
    str db 'Some string'


nasm -f elf64 main.asm
main.asm:15: error: comma, colon, decorator or end of line expected after operand

As I read in this answer, this is because str is an instruction mnemonic. So I added a colon after str, and now it assembles fine. But what about the line

mov rsi, str

str is an instruction mnemonic here, but it still compiles fine. Why?

Answer 1:

As the NASM manual explains, other than macro definitions and directives, the format of a NASM source line has some combination of these four fields:

label:    instruction operands        ; comment

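Once NASM has seen mov as the mnemonic, it no longer considers the remaining tokens as possible instruction mnemonics; they are parsed as operands, so str is read as a symbol name. Assembly language is strictly one instruction per statement. In str db 'Some string' without a colon, on the other hand, str is the first token on the line, so NASM takes it as the mnemonic and then rejects db as its operand.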
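To make the field positions concrete, here is a minimal sketch of the program from the question with the colon added to the label. The file name demo.asm is only an assumption for illustration, and the length and exit status are filled in so the program prints the whole string and exits cleanly.

```nasm
; Assumed file name: demo.asm (hypothetical). Assemble with: nasm -f elf64 demo.asm
section .text
    global _start

_start:
    mov rax, 1          ; sys_write
    mov rdi, 1          ; fd 1 = stdout
    mov rsi, str        ; mov fills the instruction field, so str is an operand (a symbol)
    mov rdx, 11         ; length of 'Some string'
    syscall

    mov rax, 60         ; sys_exit
    xor edi, edi        ; exit status 0
    syscall

section .data
    str: db 'Some string'   ; the colon marks str as a label, so db is the mnemonic
```

The only str that could be mistaken for an instruction is one in the first field of a line without a colon; everywhere after a mnemonic it is just a name.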

If you wanted the bytes that encode an str ax instruction as the immediate operand for mov-sign-extended-imm32, you'd have to do that yourself with a numeric constant. NASM syntax doesn't have a way to do that for you, so its parser doesn't need to recurse into the operands once it's found a mnemonic.


Or, instead of encoding str manually, use db to emit the leading bytes of the mov instruction and let NASM assemble str right after them, so that its bytes become the immediate.

db 0x48, 0xc7, 0xc6    ; REX.W prefix, opcode for mov r/m64, imm32, ModRM = rsi destination
str [rax+1]            ; the disp8 makes this 4 bytes long (0f 00 48 01), filling the imm32


;; same machine code as
mov rsi, strict dword 0x0148000f    ; str [rax+1]

;; Without strict dword, NASM optimizes it to the 5-byte mov esi, imm32.
;; I guess I should have used that 5-byte instruction for the DB version...
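The 5-byte variant mentioned in the last comment might look like the sketch below. It assumes 64-bit mode and relies on 0xbe being the opcode for mov esi, imm32 (0xb8 plus register number 6 for esi). The labels db_version and mov_version are hypothetical names, added only so the two encodings are easy to find in a listing.

```nasm
bits 64

db_version:                 ; hypothetical label
    db 0xbe                 ; opcode for mov esi, imm32 (0xb8 + 6)
    str [rax+1]             ; 0f 00 48 01 becomes the imm32

mov_version:                ; hypothetical label
    mov esi, 0x0148000f     ; same 5 bytes: be 0f 00 48 01
```

Assembling this with a listing file (nasm's -l option) and comparing the bytes under the two labels is one way to check that they match.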