Calling a C function that takes no parameters with parameters
I have a strange question about possible undefined behavior in a C function call, and how it differs between 64-bit and 32-bit compilation. First, here's my code:
int f() { return 0; }
int main()
{
    int x = 42;
    return f(x);
}
As you can see, I am calling f with an argument even though f takes no parameters. My first question was whether this argument is actually passed to f when it is called.
Mysterious lines
After a little objdump, I got some curious results. By passing x as an argument to f:
00000000004004b6 <f>:
4004b6: 55 push %rbp
4004b7: 48 89 e5 mov %rsp,%rbp
4004ba: b8 00 00 00 00 mov $0x0,%eax
4004bf: 5d pop %rbp
4004c0: c3 retq
00000000004004c1 <main>:
4004c1: 55 push %rbp
4004c2: 48 89 e5 mov %rsp,%rbp
4004c5: 48 83 ec 10 sub $0x10,%rsp
4004c9: c7 45 fc 2a 00 00 00 movl $0x2a,-0x4(%rbp)
4004d0: 8b 45 fc mov -0x4(%rbp),%eax
4004d3: 89 c7 mov %eax,%edi
4004d5: b8 00 00 00 00 mov $0x0,%eax
4004da: e8 d7 ff ff ff callq 4004b6 <f>
4004df: c9 leaveq
4004e0: c3 retq
4004e1: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
4004e8: 00 00 00
4004eb: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
Without passing x as an argument:
00000000004004b6 <f>:
4004b6: 55 push %rbp
4004b7: 48 89 e5 mov %rsp,%rbp
4004ba: b8 00 00 00 00 mov $0x0,%eax
4004bf: 5d pop %rbp
4004c0: c3 retq
00000000004004c1 <main>:
4004c1: 55 push %rbp
4004c2: 48 89 e5 mov %rsp,%rbp
4004c5: 48 83 ec 10 sub $0x10,%rsp
4004c9: c7 45 fc 2a 00 00 00 movl $0x2a,-0x4(%rbp)
4004d0: b8 00 00 00 00 mov $0x0,%eax
4004d5: e8 dc ff ff ff callq 4004b6 <f>
4004da: c9 leaveq
4004db: c3 retq
4004dc: 0f 1f 40 00 nopl 0x0(%rax)
So, as we can see:
4004d0: 8b 45 fc mov -0x4(%rbp),%eax
4004d3: 89 c7 mov %eax,%edi
appear only when I call f with x, but since I'm not very good with assembly, I don't really understand these lines.
64/32 bit paradox
I also tried something else: printing the stack of my program.
The stack with x passed to f (compiled for 64 bits):
Address of x: ffcf115c
ffcf1128: 0 0
ffcf1130: -3206820 0
ffcf1138: -3206808 134513826
ffcf1140: 42 -3206820
ffcf1148: -145495616 134513915
ffcf1150: 1 -3206636
ffcf1158: -3206628 42
ffcf1160: -143903780 -3206784
Stack with x not passed to f (compiled for 64 bits):
Address of x: 3c19183c
3c191818: 0 0
3c191820: 1008277568 32766
3c191828: 4195766 0
3c191830: 1008277792 32766
3c191838: 0 42
3c191840: 4195776 0
And for some reason, in 32 bits, x seems to be pushed onto the stack.
Stack with x passed to f (compiled for 32 bits):
Address of x: ffdc8eac
ffdc8e78: 0 0
ffdc8e80: -2322772 0
ffdc8e88: -2322760 134513826
ffdc8e90: 42 -2322772
ffdc8e98: -145086016 134513915
ffdc8ea0: 1 -2322588
ffdc8ea8: -2322580 42
ffdc8eb0: -143494180 -2322736
Why the hell does x appear on the stack in 32-bit but not in 64-bit?
Print code: http://paste.awesom.eu/yayg/QYw6&ln
Why am I asking such stupid questions?
- First, because I have not found any standard that answers my question.
- Second, think about calling a variadic function in C without counting arguments.
- Last but not least, I think undefined behavior is fun.
Thank you for reading so far and helping me understand something or making me realize that my questions are meaningless.
The answer is that, as you suspect, what you are doing is undefined behavior (in the case where the extra argument is passed).
However, the actual behavior is harmless in many implementations. The argument is prepared on the stack and ignored by the called function. The called function is not responsible for removing arguments from the stack, so there is no harm (such as an unbalanced stack pointer).
This benign behavior is what once allowed C hackers to develop a variable argument list facility, which used to live under #include <varargs.h> in ancient versions of the Unix C library. It evolved into the ANSI C <stdarg.h>.
The idea was this: pass additional arguments to the function, and then dynamically walk the stack to retrieve them.
That won't work today. For example, as you can see, the parameter is not actually pushed onto the stack, but rather loaded into the register RDI (the disassembly shows its low 32 bits, EDI, because x is an int). This is the convention used by GCC on x86-64 Linux, as laid down by the System V AMD64 ABI. If you walk the stack, you won't find the first few parameters. On IA-32, by contrast, GCC passes parameters on the stack, although you can get register-based behavior with the "fastcall" convention.
The va_arg macro from <stdarg.h> will correctly take the mixed register/stack parameter passing convention into account. (Or rather, when you use the correct declaration for a variadic function, the compiler may suppress the passing of the trailing arguments in registers, so that va_arg can just walk through memory.)
P.S. Your machine code might be easier to follow if you enabled some level of optimization. For example, the sequence
4004c9: c7 45 fc 2a 00 00 00 movl $0x2a,-0x4(%rbp)
4004d0: 8b 45 fc mov -0x4(%rbp),%eax
4004d3: 89 c7 mov %eax,%edi
4004d5: b8 00 00 00 00 mov $0x0,%eax
is rather obtuse because of what look like wasteful data moves.
How arguments are passed to a function depends on the platform ABI (Application Binary Interface). The ABI is what allows libraries compiled with compiler X to be used with code compiled with compiler Y. None of this is defined by the C standard.
The standard does not even require a "stack" to exist, much less that it be used for function calls.
The x86 chips had a limited number of registers, and the ABI reflects this fact; the usual 32 bit x86 calling convention uses a stack for all arguments.
This is not the case for the 64-bit architecture, which has many more registers and uses some of them for the first few parameters. This makes function calls considerably faster.
Similarly, the 32-bit Windows "fastcall" calling convention passes some arguments in registers. (To use a non-default calling convention, you need to annotate the function declaration appropriately, and do so consistently, including where the function is defined.)
More information on the various calling conventions can be found in the Wikipedia article on the subject. The AMD64 ABI can be found at x86-64.org (PDF document). The original System V IA-32 ABI (the base ABI used in Linux, xBSD, and OS X) is still available from www.sco.com (PDF document).
Undefined behavior?
The code presented in the OP definitely has undefined behavior.
- In a function definition, an empty parameter list means that the function takes no parameters. In a function declaration that is not a definition, an empty parameter list says nothing about how many arguments the function takes.
  §6.7.6.3/p.14: An empty list in a function declarator that is part of a definition of that function specifies that the function has no parameters. The empty list in a function declarator that is not part of a definition of that function specifies that no information about the number or types of the parameters is supplied.
- When the function is eventually called, it must be called with the correct number of arguments:
  §6.5.2.2/p.6: If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double ... If the number of arguments does not equal the number of parameters, the behavior is undefined.
- If the function is defined as a vararg function (with a trailing ellipsis), the vararg declaration must be visible wherever the function is called.
  (Continuing the previous quote): If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined.