Advantages and Disadvantages of Floating Point and Fixed Point Representations

I have been trying for the past three days to understand the exact differences between floating point and fixed point representations. I am confused reading the material and I cannot decide what is right and what is wrong.

One problem is the meaning of several technical terms such as precision, mantissa, denormalized, underflow, etc.

Can anyone explain the differences with examples?

The points I have been able to learn so far (and understand clearly) are as follows: -

Floating point -
1. Advantage: provides a very large range
2. Disadvantage: large numbers get rounded off

Fixed point -
1. Advantage: numbers are represented exactly (used when "money" is involved)
2. Disadvantage: provides a very limited range

But I know there are a lot more differences (mainly advantages and disadvantages). Can anyone list them with explanations?



4 answers

Floating point takes a long time to get used to, so I won't go into every detail here.

Simply put, floating point numbers cover a huge range (from very small numbers close to zero to very large numbers, sometimes even larger than the number of atoms in the universe). They achieve this by keeping a constant relative precision. That is, a number starts to be rounded after a fixed number of significant digits (this is a simplification, but it helps to understand the principle). This is very similar to the concept of "significant figures" in most natural sciences. However, it means that floating point numbers are always rounded. If you add a very small number to a very large number, the small number is simply dropped and the large number remains unchanged. This happens when the small number falls below the rounding threshold of the large one. If you add up a lot of numbers, you may sometimes need to sort them first and add from small to large. There is also the concept of numerical stability, i.e. how far an algorithm deviates from the correct result because of rounding.
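To illustrate the absorption effect described above, here is a minimal Python sketch. It assumes IEEE 754 double precision (what Python's `float` uses on common platforms); the helper `naive_sum` and the chosen values are only for illustration. An explicit loop is used because some Python versions apply a compensated algorithm inside the built-in `sum`.

```python
def naive_sum(values):
    """Add the values left to right with plain float addition."""
    total = 0.0
    for v in values:
        total += v
    return total

big = 1e16            # at this magnitude, neighbouring doubles are 2 apart
small = [1.0] * 10    # each of these is below the rounding threshold of `big`

# A single small number added to the large one is simply lost.
absorbed = (big + 1.0 == big)

# Large number first: every 1.0 is dropped in turn.
large_first = naive_sum([big] + small)

# Small numbers first: they accumulate to 10.0 before meeting the large one.
small_first = naive_sum(sorted([big] + small))

print(absorbed)                    # True
print(small_first - large_first)   # 10.0
```

Sorting is only one remedy; the standard library's `math.fsum` is another way to reduce this kind of error.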

On the other hand, a fixed point representation always has the same absolute error. If you store currency with four decimal places, you know that each stored value is off by at most 0.00005 units. However, when you add values together, this error can still accumulate, although the rules for this are very different from those for floating point.
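As a rough sketch of that constant absolute error, the Python snippet below stores values as integers scaled by 10,000 (four decimal places). The names `SCALE`, `to_fixed` and `from_fixed` are made up for this example, and round-half-even is just one possible rounding choice.

```python
from decimal import Decimal, ROUND_HALF_EVEN

SCALE = 10_000  # four decimal places (assumed for this example)

def to_fixed(text):
    """Parse a decimal string and round it to a whole number of 1/10000 units."""
    scaled = Decimal(text) * SCALE
    return int(scaled.to_integral_value(rounding=ROUND_HALF_EVEN))

def from_fixed(units):
    """Convert the stored integer back to a Decimal value."""
    return Decimal(units) / SCALE

def rounding_error(text):
    """Absolute difference between the exact input and the stored value."""
    return abs(from_fixed(to_fixed(text)) - Decimal(text))

# The error bound is the same for small and large values.
for sample in ["1.234567", "123456789.000049", "0.00004"]:
    print(sample, to_fixed(sample), rounding_error(sample))
```

Whatever the magnitude of the input, the error stays at or below 0.00005, which is the contrast with the relative error of floating point.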

Unless you are doing heavy number crunching, you probably don't need to worry about these issues. In most cases both floating point and fixed point numbers work very well when handled with care (i.e., never use == on floating point or fixed point numbers; the correct way to compare them differs between the two). Also, AFAIK floating point is used more often in scientific work, because scientists are usually trained in numerical computation, know how to deal with rounding, and are only interested in relative accuracy. Fixed point is used in finance, where every rounding has to be accounted for and booked somewhere (banks often just keep the rounded-off fractions), so you need very good control of the absolute error in order to account for it later.
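For the floating point side of the comparison advice, a common approach (sketched here in Python, with a tolerance picked arbitrarily for the example) is to compare with a relative tolerance instead of `==`:

```python
import math

a = 0.1 + 0.2
b = 0.3

exact_equal = (a == b)                        # False: the two doubles differ in the last bit
close_enough = math.isclose(a, b, rel_tol=1e-9)  # True: equal within a relative tolerance

print(exact_equal, close_enough)
```

The right tolerance depends on the computation; `1e-9` is only a placeholder.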



Floating point numbers are good for, well, floating points, i.e. when you need to express numbers at different scales. You sacrifice precision to gain range.

On the other hand, fixed point numbers only work at one fixed scale (and will overflow or underflow if you scale them too far), but you keep full precision as long as you stay within that scale.

In short: if you multiply a lot but don't add numbers of different scales, use floating point. If you add a lot but don't multiply, use fixed point.

(A good example of fixed point use is anything to do with currency: essentially, you fix your unit as cents or hundredths of a cent and make all your monetary values whole numbers in that unit.)
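A small Python sketch of the currency case, assuming the unit is fixed as whole cents (the variable names are illustrative only): adding ten cents ten times as a binary float drifts, while the integer count of cents stays exact.

```python
# Floating point: 0.10 has no exact binary representation, so errors creep in.
total_float = 0.0
for _ in range(10):
    total_float += 0.10

# Fixed point: the unit is one cent, so every amount is a whole number.
total_cents = 0
for _ in range(10):
    total_cents += 10

print(total_float == 1.0)   # False
print(total_cents == 100)   # True
print(f"{total_cents // 100}.{total_cents % 100:02d}")  # 1.00
```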



Fixed point numbers can be sorted in linear time. Fixed point is also unambiguous: each numeric value that can be expressed in a particular fixed point format has exactly one representation, which is not true of floating point.

Floating point has a much wider representable range. It is also ambiguous. Floating point numbers can be sorted in O(N log N) time.
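One concrete case of that ambiguity, assuming IEEE 754 doubles as used by Python's `float`, is zero: positive and negative zero compare equal but are stored as different bit patterns. The sketch below uses the standard `struct` module to show the raw bytes.

```python
import struct

pos_zero = 0.0
neg_zero = -0.0

# Same numeric value...
same_value = (pos_zero == neg_zero)

# ...but two different 8-byte encodings (the sign bit differs).
pos_bits = struct.pack(">d", pos_zero)
neg_bits = struct.pack(">d", neg_zero)
same_bits = (pos_bits == neg_bits)

print(same_value, same_bits)           # True False
print(pos_bits.hex(), neg_bits.hex())  # 0000000000000000 8000000000000000
```

A plain scaled integer has only one encoding for each value it can hold.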



Fixed point is an integer representation of a fractional number. Operations can therefore be applied to it just as they are to integers. The advantage is that floating point arithmetic is more expensive in terms of computational power, although newer processors have dedicated FPUs (floating point units) to handle it.

This kind of fixed point arithmetic is used when processing power is limited and a small loss of precision does no harm.
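To make the "integer representation" idea concrete, here is a minimal Python sketch of a Q16.16 format (16 integer bits, 16 fractional bits). The format and the helper names are assumptions chosen for the example; real embedded code would typically do this in C with fixed-width integers and handle overflow.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # the integer that represents 1.0

def to_q(x):
    """Convert a float to the nearest Q16.16 integer."""
    return int(round(x * ONE))

def to_float(q):
    """Convert a Q16.16 integer back to a float for display."""
    return q / ONE

def q_add(a, b):
    """Addition is plain integer addition."""
    return a + b

def q_mul(a, b):
    """Multiply, then shift right to remove the extra scale factor."""
    return (a * b) >> FRAC_BITS

x = to_q(1.5)
y = to_q(2.25)

print(to_float(q_add(x, y)))  # 3.75
print(to_float(q_mul(x, y)))  # 3.375
```

Only integer addition, multiplication and shifts are needed, which is why this works on processors without an FPU.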


