The C compiler optimizes the loop by running it
Can the C compiler ever optimize the loop by running it?
For example:
    int num[] = {1, 2, 3, 4, 5}, i;
    for(i = 0; i < sizeof(num)/sizeof(num[0]); i++) {
        if(num[i] > 6) {
            printf("Error in data\n");
            exit(1);
        }
    }
Instead of running this check every time the program is executed, can the compiler run it once at compile time and optimize it away?
Let's see ... (This is the only way to tell.)
First, I turned your snippet into something that we can actually compile and run, and saved it in a file called main.c.
    #include <stdio.h>

    static int
    f()
    {
        const int num[] = {1, 2, 3, 4, 5};
        int i;
        for (i = 0; i < sizeof(num) / sizeof(num[0]); i++)
        {
            if (num[i] > 6)
            {
                printf("Error in data\n");
                return 1;
            }
        }
        return 0;
    }

    int
    main()
    {
        return f();
    }
Running gcc -S -O3 main.c creates the following assembly file (in main.s).
.file "main.c"
.section .text.unlikely,"ax",@progbits
.LCOLDB0:
.section .text.startup,"ax",@progbits
.LHOTB0:
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB22:
.cfi_startproc
xorl %eax, %eax
ret
.cfi_endproc
.LFE22:
.size main, .-main
.section .text.unlikely
.LCOLDE0:
.section .text.startup
.LHOTE0:
.ident "GCC: (GNU) 5.1.0"
.section .note.GNU-stack,"",@progbits
Even if you don't know assembly, you will notice that the string "Error in data\n" is missing from the file, so apparently some optimization must have taken place.
If we look closer at the machine instructions generated for the function main,
    xorl %eax, %eax
    ret
we can see that all it does is XOR the EAX register with itself (which always yields zero) and write that value back to EAX. Then it returns. The EAX register is used to hold the return value. As we can see, the function f has been optimized away completely.
Compilers can do even better. Not only can they evaluate the effect of running code forward, but the Standard even allows them to reason about code logic in reverse in situations involving potential Undefined Behavior. For example, given:
    #include <stdio.h>

    int main(void)
    {
        int ch = getchar();
        int q;
        if (ch == 'Z')
            q=5;
        printf("You typed %c and the magic value is %d", ch, q);
        return 0;
    }
a compiler would be entitled to assume that the program will never receive any input that causes printf to be reached without q having been given a value. Since the only input character that causes q to receive a value is 'Z', a compiler could legitimately replace the code with:
    int main(void)
    {
        getchar();
        printf("You typed Z and the magic value is 5");
    }
If the user types Z, the behavior of the original program is well defined and the behavior of the replacement matches it. If the user types anything else, the original program invokes Undefined Behavior, so the Standard imposes no requirements on what the compiler does. The compiler is free to do whatever it likes, including producing the same output that typing Z would have generated.