Why does using char argv instead of char **argv as the argument of main produce the following output?
When I do this:
int main(int agrc, char argv)
{
    printf("%d", argv);
    return 0;
}
I get this output when I run the program from the command line:
$ prog_name 0
0
$ prog_name (from 0-7 characters)
48
$ prog_name 12345678
56
$ prog_name 1234567812345678
64
// and so on...
So where do these values come from, and why do they increase by 8?
What happens when I have this:
int main(int agrc, char argv[])
?
Your output is most likely the address held in the "normal" argv parameter, implicitly interpreted (rather than converted; see the comments) as a char. In other words, I suspect that what you have is equivalent to:
int main(int agrc, char **argv)
{
    printf("%d", (char) argv);
    return 0;
}
On my machine (CentOS 6, 32-bit), the disassembly of this equivalent version is:
0x080483c4 <+0>: push %ebp
0x080483c5 <+1>: mov %esp,%ebp
0x080483c7 <+3>: and $0xfffffff0,%esp
0x080483ca <+6>: sub $0x10,%esp
0x080483cd <+9>: mov 0xc(%ebp),%eax
0x080483d0 <+12>: movsbl %al,%eax
0x080483d3 <+15>: mov %eax,0x4(%esp)
0x080483d7 <+19>: movl $0x80484b4,(%esp)
0x080483de <+26>: call 0x80482f4 <printf@plt>
and the disassembly of the code you posted is:
0x080483c4 <+0>: push %ebp
0x080483c5 <+1>: mov %esp,%ebp
0x080483c7 <+3>: and $0xfffffff0,%esp
0x080483ca <+6>: sub $0x20,%esp
0x080483cd <+9>: mov 0xc(%ebp),%eax
0x080483d0 <+12>: mov %al,0x1c(%esp)
0x080483d4 <+16>: movsbl 0x1c(%esp),%eax
0x080483d9 <+21>: mov %eax,0x4(%esp)
0x080483dd <+25>: movl $0x80484b4,(%esp)
0x080483e4 <+32>: call 0x80482f4 <printf@plt>
In both cases, $0x80484b4 is the address of the string literal holding the format specifier "%d", and 0xc(%ebp) is the location of the value that is actually passed to printf():
(gdb) x/db 0xbffff324
0xbffff324: -60
(gdb) p $al
$3 = -60
Note that AL (the one-byte accumulator, i.e. the low part of EAX) "fetches" only the first byte at the address $ebp+0xc (my CPU is little endian, so this is actually the least significant byte). This means that the conversion to (char) is "truncating" the address stored in argv.
As a consequence, you may notice that each of these numbers has its log2(n) least significant bits unset, where n is the alignment of the object being pointed to. This is due to the alignment requirement for objects of pointer type. Typically, for a 32-bit x86 machine, alignof(char **) == 4.
As already pointed out in the comments, your code violates the C standard, so this is an example of undefined behavior (UB).
From the C standard, regarding the signature of main():
"The implementation declares no prototype for this function."
As a result, the compiler will not complain if you declare main() with different parameter types.
In your code
int main(int agrc, char argv)
is not a recommended signature for main(). It must be either
int main(int agrc, char* argv[])
or, equivalently,
int main(int agrc, char** argv)
Otherwise, the behavior is undefined in a hosted environment. More about this can be found in the C11 standard, section 5.1.2.2.1.
In your case, as you can see, you are declaring the second parameter as a char. According to the standard:
"If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, ...."
So here the supplied 0 is passed to main() as a pointer to a string, but it is received as a char, which is not defined behavior.
There is a pointer to a string on the stack, but you declared the second parameter of main as a char and then printed it as a decimal number. The memory address of this string is not predictable, so you get unpredictable output.
Try the following:
#include <stdio.h>

int main( int argc, char* argv[] )
{
    if ( argc > 1 )  /* argv[1] exists only when an argument was given */
        printf( "%s\n", argv[1] );
    return 0;
}
I think this will give you what you intended.