I'm currently solving problem 3.3 from 3rd edition of Computer System: a programmer's perspective and I'm having a hard time understanding what these errors mean...
movb $0xF, (%ebx)
gives an error because ebx can't be used as address register
movl %rax, (%rsp)
and
movb %si, 8(%rbp)
gives error saying that theres a mismatch between instruction suffix and register I.D.
movl %eax, %rdx
gives an error saying that destination operand incorrect size
why can't we use ebx as address register? Is it because its 32-bit register? Would the following line work if it was movb $0xF, (%rbx)
instead? since rbx is of 64bit register?
for the error regarding mismatch between instruction suffix and register I.D, does this error appear because it should've been movq %rax, (%rsp)
and movew %si, 8(%rbp)
instead of movl %rax, (%rsp)
and movb %si, 8(%rbp)
?
and lastly, for the error regarding "destination operand incorrect size", is this because the destination register was 64 bit instead of 32? so if the line of code was movl %eax, %edx
instead, the error wouldn't have occurred?
any enlightenment would be appreciated.
this is for x86-64
movb $0xF, (%ebx) gives an error because ebx can't be used as address register
It's true that ebx
can't be used as an address register (for x86-64), but rbx
can. ebx is the lower 32bits of rbx. The whole point of 64bit code is that addresses can be 64bits, so trying to reference memory by using a 32bit register makes little sense.
movl %rax, (%rsp) and movb %si, 8(%rbp) gives error saying that
theres a mismatch between instruction suffix and register I.D.
Yes, because you are using movl
, the 'l' means long, which (in this context) means 32bits. However, rax is a 64bit register. If you want to write 64bits out of rax, you should use movq
. If you want to write 32bits, you should use eax
.
movl %eax, %rdx gives an error saying that destination operand incorrect size
You are trying to move a 32bit value into a 64bit register. There are instructions to do this conversion for you (see cdq
for example), but movl isn't one of them.
movb $0xF, (%ebx)
assembles just fine (with a 0x67
address-size prefix), and executes correctly if the address in ebx
is valid.
It might be a bug (and e.g. lead to a segfault from truncating a pointer), or sub-optimal, but if your book makes any stronger claim than that (like that it won't assemble) then your book contains an error.
The only reason you'd ever use that instead of movb $0xF, (%rbx)
is if the upper bytes of %rbx
potentially held garbage, e.g. in the x32 ABI (ILP32 in long mode), or if you're a dumb compiler that always uses address-size prefixes when targeting 32-bit-pointer mode even when addresses are known to be safely zero-extended.
32-bit address size is actually useful for the x32 ABI for the more common case where an index register holds high garbage, e.g. movl $0x12345, (%edi, %esi,4)
.
gcc -mx32
could easily emit a movb $0xF, (%ebx)
instruction in real life. (Note that -mx32
(32-bit pointers in long mode) is different from -m32
(i386 ABI))
int ext(); // can't inline
void foo(char *p) {
ext(); // clobbers arg-passing registers
*p = 0xf; // so gcc needs to save the arg for after the call
}
Compiles with gcc7.3 -mx32 -O3
on the Godbolt compiler explorer into
foo(char*):
pushq %rbx # rbx is gcc's first choice of call-preserved reg.
movq %rdi, %rbx # stupid gcc copies the whole 64 bits when only the low 32 are useful
call ext()
movb $15, (%ebx) # $15 = $0xF
popq %rbx
ret
mov $edi, %ebx
would have been better; IDK why gcc wants to copy the whole 64-bit register when it's treating pointers as 32-bit values. The x32 ABI unfortunately never really caught on on x86 so I guess nobody's put in the time to get gcc to generate great code for it.
AArch64 also has an ILP32 ABI to save memory / cache-footprint on pointer data, so maybe gcc will get better at 32-bit pointers in 64-bit mode in general (benefiting x86-64 as well) if any work for AArch64 ILP32 improves the common cross-architecture parts of this.
so if the line of code was movl %eax, %edx instead, the error wouldn't have occurred?
Right, that would zero-extend EAX into RDX. If you wanted to sign-extend EAX into RDX, use movslq %eax, %rdx
(aka Intel-syntax movsxd
)
(Almost) all x86 instructions require all their operands to be the same size. (In terms of operand-size; many instructions have a form with an 8-bit or 32-bit immediate that's sign extended to 64-bit or whatever the instruction's operand-size is. e.g. add $1, %eax
will use the 3-byte add imm8, r/m32
form.)
Exceptions include shl %cl, %eax
, and movzx/movsx.
In AT&T syntax, the sizes of registers have to match the operand-size suffix, if you use one. If you don't, the registers imply an operand-size. e.g. mov %eax, %edx
is the same as movl
.
Memory + immediate instructions with no register source or destination need an explicit size: add $1, (%rdx)
won't assemble because the operand-size is ambiguous, but add %eax, (%rdx)
is an addl
(32-bit operand-size).
movew %si, 8(%rbp)
No, movw %si, 8(%rbp)
would work though :P But note that if you've made a traditional stack frame with push %rbp
/ mov %rsp, %rbp
on function entry, that store to 8(%rbp)
will overwrite the low 16 bits of your return address on the stack.
But there's no requirement in x86-64 code for Windows or Linux that you have %rbp
pointing there, or holding a valid pointer at all. It's just a call-preserved register like %rbx
that you can use for whatever you want as long as you restore the caller's value before returning.