How to optimize for constant division?
Optimization for divided by constant is well optimized by gcc as you know :)
Now I'm wondering how constant division is optimized . gcc doesn't help me, nor does clang.
Perhaps I am not good at such information, but I cannot find material on optimization for the separating constant . (In contrast, division by a constant is well introduced.)
#include <stdio.h>
int f(int x)
{
// can I optimize off the idiv opcode here?
return 33659/x;
}
int main()
{
int x;
scanf("%d", &x);
printf("%d", f(x));
return 0;
}
EDIT1:
#include <stdio.h>
#define DIVIDEND 33
void f ( unsigned int* arr, int n )
{
for ( int i = 0; i < n ; i++ )
{
arr[i] = DIVIDEND / arr[i];
}
}
int main()
{
const int n = 1024;
unsigned int buf[n];
for ( int i = 0; i < n; i++ )
{
scanf ( "%u", buf + i );
}
f ( buf, n );
for ( int i = 0; i < n; i++ )
{
printf ( "%d", buf[i] );
}
return 0;
}
Optimized with clang -O3 -march=native div.c -o div
just unrolling for loop, while:
#include <stdio.h>
#define DIVIDEND 33
#define DIVISOR DIVIDEND
void f ( unsigned int* arr, int n )
{
for ( int i = 0; i < n ; i++ )
{
//arr[i] = DIVIDEND / arr[i];
arr[i] = arr[i] / DIVISOR;
}
}
int main()
{
const int n = 1024;
unsigned int buf[n];
for ( int i = 0; i < n; i++ )
{
scanf ( "%u", buf + i );
}
f ( buf, n );
for ( int i = 0; i < n; i++ )
{
printf ( "%d", buf[i] );
}
return 0;
}
using the same command line, you end up with a bunch of terrifying AVX2 code. (Remember that division by a constant is rewritten in shift + mul + add, which can be vectorized!)
EDIT2: Thanks @ user2722968! Applying RCPPS will speed up program execution.
Here is my experimental implementation using RCPPS for fast constant divisor division:
https://github.com/ThinerDAS/didactic-spoon/blob/master/div.c
However, I'm not sure how to make it more accurate without a lot of overhead.
source to share