A faster way to multiply floats

We can use the left shift operator in C/C++ as a faster way to multiply integers by powers of 2.

But we cannot use the left shift operators for float or double, because they are represented differently, having an exponent component and a mantissa component.

My question is:

Is there any way, analogous to the left shift operator for integers, to multiply floating-point numbers faster? Even just by powers of 2?

+2




5 answers


No, you cannot. But depending on your problem, you can use SIMD instructions to perform a single operation on multiple packed variables. Read about the SSE2 instruction set. http://en.wikipedia.org/wiki/SSE2
http://softpixel.com/~cwright/programming/simd/sse2.php



In any case, if you are optimizing floating-point multiplication, in 99% of cases you are looking in the wrong place. Without getting into a long rant about premature optimization, at least justify it by doing proper profiling.

+13




You can do it:

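float f = 5.0f;
unsigned int bits;                /* assumes 32-bit unsigned int and IEEE 754 float */
memcpy(&bits, &f, sizeof bits);   /* needs <string.h>; avoids the aliasing violation of (int*)&f */
bits += 0x00800000u;              /* add 1 to the exponent field, i.e. f * 2 */
memcpy(&f, &bits, sizeof f);      /* f is now 10.0f */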

      



But then you have the overhead of moving the float out of a register into memory and back into another register, only to be written back to memory again: about 15 more cycles than if you had just done an fmul. And that is assuming your system uses IEEE floats at all.

Don't try to optimize this. You should look at the rest of your program to find algorithmic optimizations, rather than trying to discover ways to micro-optimize things like floats. It will only end in blood and tears.

+5




The speed of your floating-point operations seems to depend on the mix of instructions. This is explained here:

What is the relative speed of floating point addition versus floating point multiplication

One alternative is to use fixed-point numbers instead of "real" floats.

+1




Really, any decent compiler will recognize compile-time power-of-two constants and use the smartest operation.

+1




In Microsoft Visual C++, don't forget the floating-point model switch. The default is /fp:precise, but you can change it to /fp:fast. The fast model trades some floating-point precision for more speed. In some cases the speedups can be dramatic (the blog post referenced below reports speedups of up to 5x in some cases). Note that Xbox games are compiled with /fp:fast by default.

I just switched from /fp:precise to /fp:fast in my math application (which does many float multiplications) and got an immediate 27% speedup, with almost no loss of precision on my test set.

Read the Microsoft blog post regarding the details of this switch here. It looks like the main reasons not to enable it would be if you need all the precision you can get (e.g. games with large worlds, or lengthy simulations where errors can accumulate), or if you need robust handling of double or float NaN.

Finally, also consider enabling the SSE2 instruction extensions. This gave an additional 3% boost in my application. The effect will vary depending on the number of operands in your arithmetic, etc. For example, these extensions can provide a speedup in cases where you add or multiply more than two numbers at the same time.

0








