Calling a C function that takes no parameters with arguments

I have a somewhat strange question about possible undefined behavior in a C function call, and how it differs between 64-bit and 32-bit compilation. First, here's my code:

int f() { return 0; }

int main()
{
    int x = 42;
    return f(x);
}

      

As you can see, I am calling f with an argument even though f takes no parameters. My first question is whether this argument is actually passed to f when it is called.

Mysterious lines

After a little work with objdump, I got some curious results. When passing x as an argument to f:

00000000004004b6 <f>:
  4004b6:   55                      push   %rbp
  4004b7:   48 89 e5                mov    %rsp,%rbp
  4004ba:   b8 00 00 00 00          mov    $0x0,%eax
  4004bf:   5d                      pop    %rbp
  4004c0:   c3                      retq   

00000000004004c1 <main>:
  4004c1:   55                      push   %rbp
  4004c2:   48 89 e5                mov    %rsp,%rbp
  4004c5:   48 83 ec 10             sub    $0x10,%rsp
  4004c9:   c7 45 fc 2a 00 00 00    movl   $0x2a,-0x4(%rbp)
  4004d0:   8b 45 fc                mov    -0x4(%rbp),%eax
  4004d3:   89 c7                   mov    %eax,%edi
  4004d5:   b8 00 00 00 00          mov    $0x0,%eax
  4004da:   e8 d7 ff ff ff          callq  4004b6 <f>
  4004df:   c9                      leaveq 
  4004e0:   c3                      retq   
  4004e1:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  4004e8:   00 00 00 
  4004eb:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

      

Without passing x as an argument:

00000000004004b6 <f>:
  4004b6:   55                      push   %rbp
  4004b7:   48 89 e5                mov    %rsp,%rbp
  4004ba:   b8 00 00 00 00          mov    $0x0,%eax
  4004bf:   5d                      pop    %rbp
  4004c0:   c3                      retq   

00000000004004c1 <main>:
  4004c1:   55                      push   %rbp
  4004c2:   48 89 e5                mov    %rsp,%rbp
  4004c5:   48 83 ec 10             sub    $0x10,%rsp
  4004c9:   c7 45 fc 2a 00 00 00    movl   $0x2a,-0x4(%rbp)
  4004d0:   b8 00 00 00 00          mov    $0x0,%eax
  4004d5:   e8 dc ff ff ff          callq  4004b6 <f>
  4004da:   c9                      leaveq 
  4004db:   c3                      retq   
  4004dc:   0f 1f 40 00             nopl   0x0(%rax)

      

So, as we can see:

  4004d0:   8b 45 fc                mov    -0x4(%rbp),%eax
  4004d3:   89 c7                   mov    %eax,%edi

      

appear only when I call f with x, but since I'm not very good with assembly, I don't really understand these lines.

64/32 bit paradox

I also tried something else: printing my program's stack.

Stack with x passed to f (compiled for 64 bits):

Address of x: ffcf115c
  ffcf1128:          0          0
  ffcf1130:   -3206820          0
  ffcf1138:   -3206808  134513826
  ffcf1140:         42   -3206820
  ffcf1148: -145495616  134513915
  ffcf1150:          1   -3206636
  ffcf1158:   -3206628         42
  ffcf1160: -143903780   -3206784

      

Stack with x not passed to f (compiled for 64 bits):

Address of x: 3c19183c
  3c191818:          0          0
  3c191820: 1008277568      32766
  3c191828:    4195766          0
  3c191830: 1008277792      32766
  3c191838:          0         42
  3c191840:    4195776          0

      

And for some reason, in 32 bits, x seems to be pushed onto the stack.

Stack with x passed to f (compiled for 32 bits):

Address of x: ffdc8eac
  ffdc8e78:          0          0
  ffdc8e80:   -2322772          0
  ffdc8e88:   -2322760  134513826
  ffdc8e90:         42   -2322772
  ffdc8e98: -145086016  134513915
  ffdc8ea0:          1   -2322588
  ffdc8ea8:   -2322580         42
  ffdc8eb0: -143494180   -2322736

      

Why the hell does x appear on the stack in 32 bits but not in 64 bits?

Print code: http://paste.awesom.eu/yayg/QYw6&ln

Why am I asking such stupid questions?

  • First, because I have not found any standard that answers my question.
  • Second, think about calling a variadic function in C without counting arguments.
  • Last but not least, I think undefined behavior is fun.

Thank you for reading so far and helping me understand something or making me realize that my questions are meaningless.



2 answers


The answer is that, as you suspect, what you are doing is undefined behavior (in case of passing an extra argument).

However, the actual behavior is harmless in many implementations. The argument is prepared (on the stack or in a register) and ignored by the called function. Under the usual C calling conventions the called function is not responsible for removing arguments from the stack, so no harm results (such as an unbalanced stack pointer).

This innocuous behavior is what once allowed C hackers to build a variable-argument-list facility, which used to live under #include <varargs.h> in ancient versions of the Unix C library. It evolved into the ANSI C <stdarg.h>.

The idea was this: pass additional arguments to the function, and then walk the stack dynamically to retrieve them.



That won't work today. For example, as you can see, the argument is not actually pushed onto the stack but loaded into the RDI register (the mov %eax,%edi line writes EDI, the low 32 bits of RDI). This is the convention used by GCC on x86-64. If you walk the stack, you won't find the first several arguments. On IA-32, by contrast, GCC passes arguments on the stack, although you can get register-based behavior with the "fastcall" convention.

The va_arg macro from <stdarg.h> correctly accounts for the mixed register/stack parameter-passing convention. (How it does so is implementation-specific. On x86-64 System V, the variadic function spills its argument registers to a save area that va_arg reads; other ABIs may instead pass the trailing arguments of a correctly declared variadic function in memory, so that va_arg can simply walk through it.)
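As a minimal sketch of the portable alternative to walking the stack by hand, here is a variadic function written with <stdarg.h>. The name sum_ints and its "count first" convention are hypothetical, chosen only for illustration; the point is that va_start, va_arg, and va_end hide whether the arguments arrived in registers or on the stack.

```c
#include <stdarg.h>

/* Hypothetical helper: adds up the `count` int arguments that follow it.
   The caller must pass exactly `count` ints; passing fewer is undefined. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);              /* start after the last named parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);     /* fetch the next argument as an int */
    va_end(ap);

    return total;
}
```

With this definition, a call such as sum_ints(3, 1, 2, 3) returns 6. Note that the function still needs some way to know how many arguments to read (here the leading count), since <stdarg.h> offers no way to count them.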

P.S. Your machine code might be easier to follow if you enabled some optimization. For example, the sequence

  4004c9:   c7 45 fc 2a 00 00 00    movl   $0x2a,-0x4(%rbp)
  4004d0:   8b 45 fc                mov    -0x4(%rbp),%eax
  4004d3:   89 c7                   mov    %eax,%edi
  4004d5:   b8 00 00 00 00          mov    $0x0,%eax

      

is rather obtuse because of what look like wasteful data moves.



How arguments are passed to a function depends on the platform ABI (Application Binary Interface). The ABI is what allows libraries compiled with compiler X to be used with code compiled with compiler Y. None of this is defined by the C standard.

The standard does not require that a "stack" exist, much less that it be used for function calls.

32-bit x86 chips have a limited number of registers, and the ABI reflects that: the usual 32-bit x86 calling convention passes all arguments on the stack.

This is not the case for the 64-bit architecture, which has many more registers and uses some of them for the first few arguments. This makes function calls faster.

Similarly, the Windows 32-bit "fastcall" calling convention passes some arguments in registers. (To use a non-standard calling convention, you need to annotate the function declaration appropriately, and do so consistently wherever the function is declared and where it is defined.)



More information on the various calling conventions can be found in the Wikipedia article on calling conventions. The AMD64 ABI can be found at x86-64.org (PDF document). The original System V IA-32 ABI (the base ABI used in Linux, xBSD, and OS X) is still available from www.sco.com (PDF document).


Undefined behavior?

The code presented in the OP definitely has undefined behavior.

  • In a function definition, an empty parameter list means that the function takes no arguments. In a function declaration that is not a definition, an empty parameter list says nothing about how many arguments the function takes.

    § 6.7.6.3/p.14: An empty list in a function declarator that is part of a definition of that function specifies that the function has no parameters. The empty list in a function declarator that is not part of a definition of that function specifies that no information about the number or types of the parameters is supplied.

  • When the function is eventually called, it must be called with the correct number of arguments:

    § 6.5.2.2/p.6: If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double ... If the number of arguments does not equal the number of parameters, the behavior is undefined.

  • If the function is defined as a vararg function (with a trailing ellipsis), the vararg declaration must be visible wherever the function is called.

    (Continuing the previous quote): If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined.
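The difference between the two kinds of empty parameter list can be sketched as follows. The names f_checked, f_unchecked, and call_both are hypothetical; the sketch assumes the pre-C23 rules quoted above (as far as I know, C23 makes an empty list equivalent to (void), so newer compilers may reject the mismatched call in both cases).

```c
/* (void) makes the definition a prototype: the compiler knows the function
   takes no arguments and must diagnose a call such as f_checked(42). */
int f_checked(void) { return 0; }

/* An empty list, as in the question. Under the rules quoted above this
   provides no prototype, so f_unchecked(42) would typically compile,
   yet its behavior is undefined because the argument count is wrong. */
int f_unchecked() { return 0; }

int call_both(void)
{
    /* f_checked(42);    <- rejected at compile time (too many arguments) */
    /* f_unchecked(42);  <- accepted by pre-C23 compilers, but undefined  */
    return f_checked() + f_unchecked();   /* both calls here are well defined */
}
```

Writing (void) in the definition, and in every declaration, turns the mistake in the question into a compile-time error instead of silent undefined behavior.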
