I've written a simple assembly program:
section .data
str_out db "%d ",10,0
section .text
extern printf
extern exit
global main
main:
MOV EDX, ESP
MOV EAX, EDX
PUSH EAX
PUSH str_out
CALL printf
ADD ESP, 8 ; clean up stack: the stack grows down, so ADD removes the two pushed arguments
MOV EAX, EDX
PUSH EAX
PUSH str_out
CALL printf
ADD ESP, 8 ; clean up stack: the stack grows down, so ADD removes the two pushed arguments
CALL exit
I am using the NASM assembler and GCC to link the object file into an executable on Linux.
Essentially, this program first copies the value of the stack pointer into register EDX and then prints the contents of that register twice. However, the value printed by the second printf call does not match the first.
This behaviour seems strange. When I replace every usage of EDX in this program with EBX, the outputted integers are identical as expected. I can only infer that EDX is overwritten at some point during the printf function call.
Why is this the case? And how can I make sure that the registers I use in future don't conflict with C lib functions?
According to the x86 ABI, EBX, ESI, EDI, and EBP are callee-save registers and EAX, ECX, and EDX are caller-save registers.
This means that functions can freely use and destroy the previous values of EAX, ECX, and EDX.
For that reason, save the values of EAX, ECX, and EDX before calling functions if you don't want them to change. That is what "caller-save" means.
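As a minimal sketch of the caller-save approach, the question's program could keep its own copy of EDX on the stack across the first call and reload it afterwards. The build commands in the comment are assumptions (32-bit, non-PIE link), and returning from main instead of calling exit is a simplification. The pushes are arranged so that ESP is 16-byte aligned at each call, which newer versions of the i386 System V ABI expect.

```nasm
; Assumed build: nasm -f elf32 prog.asm && gcc -m32 -no-pie prog.o -o prog
section .data
str_out db "%d ", 10, 0

section .text
extern printf
global main

main:
    mov  edx, esp          ; value we want to survive the call

    push edx               ; caller-save: our own copy of EDX
    push edx               ; printf argument: value to print
    push str_out           ; printf argument: format string
    call printf            ; may clobber EAX, ECX and EDX
    add  esp, 8            ; remove the two printf arguments

    mov  edx, [esp]        ; reload EDX from the saved copy
    push edx
    push str_out
    call printf
    add  esp, 12           ; remove the arguments and the saved copy

    xor  eax, eax          ; return 0 from main
    ret
```

With the saved copy reloaded, both lines of output should show the same number.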
Or better, use other registers for values that you're still going to need after a function call. A push/pop of EBX at the start/end of a function is much better than a push/pop of EDX inside a loop that makes a function call. When possible, use call-clobbered registers for temporaries that aren't needed after the call. Values that are already in memory, so they don't need to be written before being re-read, are also cheaper to spill.
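A sketch of that alternative, applied to the question's program: the value lives in EBX, which printf has to preserve, and main saves and restores EBX once because main's own caller expects it to be preserved too. The build commands in the comment are assumptions, and returning from main replaces the call to exit.

```nasm
; Assumed build: nasm -f elf32 prog.asm && gcc -m32 -no-pie prog.o -o prog
section .data
str_out db "%d ", 10, 0

section .text
extern printf
global main

main:
    push ebx               ; EBX is callee-saved: preserve it for our caller
    mov  ebx, esp          ; value we want to keep across both calls

    push ebx               ; printf argument: value to print
    push str_out           ; printf argument: format string
    call printf            ; printf must leave EBX unchanged
    add  esp, 8            ; remove the two arguments

    push ebx               ; EBX still holds the same value
    push str_out
    call printf
    add  esp, 8

    pop  ebx               ; restore the caller's EBX
    xor  eax, eax          ; return 0 from main
    ret
```

Only one push/pop pair is needed no matter how many calls are made in between, which is why this scales better than saving EDX around every call.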
Since EBX, ESI, EDI, and EBP are callee-save registers, functions have to restore the values to the original for any of those they modify, before returning.
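Seen from the callee's side, that obligation might look like the following sketch. The function sum3 is hypothetical and deliberately uses EBX and ESI (where scratch registers would do) only to show the save/restore pattern; it assumes 32-bit cdecl, with arguments on the stack and the result in EAX.

```nasm
section .text
global sum3

; Hypothetical: int sum3(int a, int b, int c)
sum3:
    push ebx               ; save the callee-saved registers we modify
    push esi
    mov  ebx, [esp+12]     ; a (skip 2 saved registers + return address)
    mov  esi, [esp+16]     ; b
    mov  eax, [esp+20]     ; c
    add  eax, ebx
    add  eax, esi          ; EAX = a + b + c (return value)
    pop  esi               ; restore in reverse order of the pushes
    pop  ebx
    ret
```

A caller can keep live values in EBX and ESI across a call to this function and find them unchanged afterwards.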
ESP is also callee-saved, but you can't mess this up unless you copy the return address somewhere. Mismatched call/ret is terrible for performance because modern CPUs use a return-address predictor.
The ABI for the target platform (e.g. 32-bit x86 Linux) defines which registers functions can use without saving them (i.e., if you want those registers preserved across a call, you have to save them yourself).
Links to ABI docs for Windows and non-Windows, 32- and 64-bit, are at https://stackoverflow.com/tags/x86/info
Having some registers that aren't preserved across calls (available as scratch registers) means functions can be smaller. Simple functions can often avoid doing any push/pop save/restores. This cuts down on the number of instructions, leading to faster code.
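For instance, a small leaf function can do all of its work in EAX, ECX, and EDX and needs no push/pop at all. The function add3 below is a hypothetical illustration, assuming 32-bit cdecl.

```nasm
section .text
global add3

; Hypothetical: int add3(int a, int b, int c)
add3:
    mov  eax, [esp+4]      ; a (first argument sits just above the return address)
    mov  ecx, [esp+8]      ; b, in a scratch register
    mov  edx, [esp+12]     ; c, in a scratch register
    add  eax, ecx
    add  eax, edx          ; EAX = a + b + c (return value)
    ret                    ; nothing to restore: only caller-saved registers were used
```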
It's important to have some of each: having to spill all state to memory across calls would bloat the code of non-leaf functions and slow things down, especially in cases where the called function didn't touch all the registers.