I have some weird question about probably undefined behavior between C calling convention and 64/32 bits compilation. First here is my code:
int f() { return 0; }
int main()
{
int x = 42;
return f(x);
}
As you can see I am calling f with an argument while f takes no parameters. My first question was does this argument is really given to f while calling it.
The mysterious lines
After a little objdump I obtained curious results. While passing x as argument of f:
00000000004004b6 <f>:
4004b6: 55 push %rbp
4004b7: 48 89 e5 mov %rsp,%rbp
4004ba: b8 00 00 00 00 mov $0x0,%eax
4004bf: 5d pop %rbp
4004c0: c3 retq
00000000004004c1 <main>:
4004c1: 55 push %rbp
4004c2: 48 89 e5 mov %rsp,%rbp
4004c5: 48 83 ec 10 sub $0x10,%rsp
4004c9: c7 45 fc 2a 00 00 00 movl $0x2a,-0x4(%rbp)
4004d0: 8b 45 fc mov -0x4(%rbp),%eax
4004d3: 89 c7 mov %eax,%edi
4004d5: b8 00 00 00 00 mov $0x0,%eax
4004da: e8 d7 ff ff ff callq 4004b6 <f>
4004df: c9 leaveq
4004e0: c3 retq
4004e1: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
4004e8: 00 00 00
4004eb: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
Without passing x as a argument:
00000000004004b6 <f>:
4004b6: 55 push %rbp
4004b7: 48 89 e5 mov %rsp,%rbp
4004ba: b8 00 00 00 00 mov $0x0,%eax
4004bf: 5d pop %rbp
4004c0: c3 retq
00000000004004c1 <main>:
4004c1: 55 push %rbp
4004c2: 48 89 e5 mov %rsp,%rbp
4004c5: 48 83 ec 10 sub $0x10,%rsp
4004c9: c7 45 fc 2a 00 00 00 movl $0x2a,-0x4(%rbp)
4004d0: b8 00 00 00 00 mov $0x0,%eax
4004d5: e8 dc ff ff ff callq 4004b6 <f>
4004da: c9 leaveq
4004db: c3 retq
4004dc: 0f 1f 40 00 nopl 0x0(%rax)
So as we can see:
4004d0: 8b 45 fc mov -0x4(%rbp),%eax
4004d3: 89 c7 mov %eax,%edi
happen when I call f with x but because I am not really good in assembly I don't really understand these lines.
The 64/32 bits paradoxe
Otherwise I tried something else and start printing the stack of my program.
Stack with x given to f (compiled in 64bits):
Address of x: ffcf115c
ffcf1128: 0 0
ffcf1130: -3206820 0
ffcf1138: -3206808 134513826
ffcf1140: 42 -3206820
ffcf1148: -145495616 134513915
ffcf1150: 1 -3206636
ffcf1158: -3206628 42
ffcf1160: -143903780 -3206784
Stack with x not given to f (compiled in 64bits):
Address of x: 3c19183c
3c191818: 0 0
3c191820: 1008277568 32766
3c191828: 4195766 0
3c191830: 1008277792 32766
3c191838: 0 42
3c191840: 4195776 0
And for some reason in 32bits x seems to be push on the stack.
Stack with x given to f (compiled in 32bits):
Address of x: ffdc8eac
ffdc8e78: 0 0
ffdc8e80: -2322772 0
ffdc8e88: -2322760 134513826
ffdc8e90: 42 -2322772
ffdc8e98: -145086016 134513915
ffdc8ea0: 1 -2322588
ffdc8ea8: -2322580 42
ffdc8eb0: -143494180 -2322736
Why the hell does x appear in 32 but not 64 ???
Code for printing: http://paste.awesom.eu/yayg/QYw6&ln
Why am I asking such stupid questions ?
- First because I didn't found any standard that answer to my question
- Secondly, think about calling a variadic function in C without the count of arguments given.
- Last but not least, I think undefined behavior is fun.
Thank you for taking the time to read until here and for helping me understanding something or making me realize that my questions are pointless.
The answer is that, as you suspect, what you are doing is undefined behavior (in the case where the superfluous argument is passed).
The actual behavior in many implementations is harmless, however. An argument is prepared on the stack, and is ignored by the called function. The called function is not responsible for removing arguments from the stack, so there no harm (such as an unbalanced stack pointer).
This harmless behavior was what enabled C hackers to develop, once upon a time, a variable argument list facility that used to be under
#include <varargs.h>
in ancient versions of the Unix C library.This evolved into the ANSI C
<stdarg.h>
.The idea was: pass extra arguments into a function, and then march through the stack dynamically to retrieve them.
That won't work today. For instance, as you can see, the parameter is not in fact put into the stack, but loaded into the
RDI
register. This is the convention used by GCC on x86-64. If you march through the stack, you won't find the first several parameters. On IA-32, GCC passes parameters using the stack, by contrast: though you can get register-based behavior with the "fastcall" convention.The
va_arg
macro from<stdarg.h>
will correctly take into account the mixed register/stack parameter passing convention. (Or, rather, when you use the correct declaration for a variadic function, it will perhaps suppress the passage of the trailing arguments in registers, so thatva_arg
can just march through memory.)P.S. your machine code might be easier to follow if you added some optimization. For instance, the sequence
is fairly obtuse due to what look like some wasteful data moves.
How arguments are passed to a function is dependent on the platform ABI (application binary interface). The ABI makes it possible to compile libraries with compiler X and use them with code compiled with compiler Y. None of this is defined by the standard.
There is no requirement by the standard that a "stack" even exist, much less that it be used for function calling.
The x86 chips had limited numbers of registers, and the ABI reflects that fact; the normal 32-bit x86 calling convention uses the stack for all arguments.
That is not the case with the 64-bit architecture, which has many more registers and uses some of them for the first few parameters. This significantly speeds up function calls.
Similarly, the Windows 32-bit "fastcall" calling convention passes a few arguments in registers. (In order to use a non-standard calling convention, you need to appropriately annotate the function declaration, and do so consistently where it is defined.)
You can find more information on various calling conventions in this Wikipedia article. The AMD64 ABI can be found on x86-64.org (PDF document). The original System V IA-32 ABI (the basis of the ABI used on Linux, xBSD and OS X) can still be accessed from www.sco.com (PDF document).
Undefined behaviour?
The code presented in the OP is definitely undefined behaviour.
In a function definition, an empty parameter list means that the function does not take any arguments. In a function declaration, an empty parameter fails to declare how many arguments the function takes.
When the function is eventually called, it must be called with the correct number of parameters:
If the function is defined as a vararg function (with a trailing ellipsis), the vararg declaration must be visible wherever the function is called.