Detecting loss of precision in pure C++

Is it possible to detect loss of precision when working with floating-point numbers (types float, double, long double)? Say:

template< typename F >
F const sum(F const & a, F const & b)
{
    F const sum_(a + b);
    // The loss of precision must be detected (here or one line above) if some fraction bit is lost due to rounding
    return sum_;
}

      

I am especially interested in the case where the target architecture has an x87 FPU, but I want to stay in pure C++ code without resorting to asm routines. C++11 or gnu++11 specific features are also acceptable, if any exist.

+3




4 answers


The C++ standard is very vague about the concept of floating-point precision. There is no fully standardized way to detect precision loss.



GNU provides an extension to enable floating-point exceptions (feenableexcept in glibc). The exception you want to catch is FE_INEXACT.
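As a rough sketch of the FE_INEXACT idea, the flag can also be polled with the standard C++11 <cfenv> functions instead of being trapped through the GNU extension. The helper name checked_sum is hypothetical, and the volatile variables are an assumption about what is needed to stop the compiler from folding or moving the addition, since not every compiler honors FENV_ACCESS:

```cpp
#include <cfenv>

// Hypothetical helper (not from the answer): adds a and b and reports
// whether the addition had to round, by polling the FE_INEXACT status flag.
template <typename F>
F checked_sum(F a, F b, bool &inexact) {
    // volatile keeps the addition at run time and between the two fenv calls
    volatile F va = a, vb = b;
    std::feclearexcept(FE_INEXACT);   // start from a clean flag
    volatile F r = va + vb;           // the operation under test
    inexact = std::fetestexcept(FE_INEXACT) != 0;
    return r;
}
```

With this sketch, checked_sum(0.5, 0.25, flag) should leave flag false, while checked_sum(0.1, 0.2, flag) should set it, because the exact sum of those two doubles is not representable.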

+4




One thing that will help you is std::numeric_limits<double>::epsilon(), which returns "the difference between 1 and the smallest value greater than 1 that is representable". In other words, it is the gap between 1 and the next representable number: under the default round-to-nearest mode, any x > 0 up to epsilon/2 is lost entirely, so 1+x evaluates to 1.
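A minimal sketch of what epsilon means in practice, assuming the default round-to-nearest mode. The helpers absorbed_by_one and eps are hypothetical names introduced only for this illustration:

```cpp
#include <limits>

// Hypothetical helper: true when x is too small to change 1 when added to it,
// i.e. x is lost completely to rounding.
template <typename F>
bool absorbed_by_one(F x) {
    // volatile forces the sum to be rounded to F (relevant on x87,
    // where intermediate results may be kept in extended precision)
    volatile F s = F(1) + x;
    return s == F(1);
}

// Hypothetical shorthand for the machine epsilon of F.
template <typename F>
F eps() { return std::numeric_limits<F>::epsilon(); }
```

Here absorbed_by_one(eps<double>()) is false, since 1 + epsilon is exactly the next representable double, while absorbed_by_one(eps<double>() / 2) is true, since the halfway case rounds back to 1.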



+1




You might consider using interval arithmetic from the Boost library. It guarantees the inclusion property, so the error interval can only grow during a calculation: ∀ x ∈ [a,b], f(x) ∈ f([a,b]).

In your case, you can use the starting range [a-EPS,a+EPS] for the original number a. After a series of operations, the width abs(y-x) of the resulting interval [x,y] is the (maximum) loss of precision you want to know.
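To make the idea concrete without depending on Boost, here is a hand-rolled sketch. It is not Boost's API: the Interval type and the add and width functions are hypothetical, and outward rounding is approximated by stepping each bound one representable value outward with std::nextafter, which is conservative under round-to-nearest:

```cpp
#include <cmath>
#include <limits>

// Hypothetical minimal interval type; not the Boost interval class.
struct Interval {
    double lo;
    double hi;
};

// Outward-rounded addition: each computed bound is pushed one representable
// value outward, so the exact sum of any two members stays inside the result.
Interval add(Interval a, Interval b) {
    const double inf = std::numeric_limits<double>::infinity();
    return { std::nextafter(a.lo + b.lo, -inf),
             std::nextafter(a.hi + b.hi, inf) };
}

// Width of the interval: an upper bound on the accumulated rounding error.
double width(Interval i) { return i.hi - i.lo; }
```

Starting from Interval{0.0, 0.0} and adding Interval{0.1, 0.1} ten times yields a narrow interval around 1 whose width bounds the rounding error accumulated along the way. A real interval library uses directed rounding and gives tighter bounds than this sketch.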

+1




You can use something like the following:

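#include <algorithm>
#include <cfenv>
#include <iostream>

// Some compilers (GCC, for one) do not implement this pragma,
// hence the volatile variables below.
#pragma STDC FENV_ACCESS ON

template <typename F> F sum(const F a, const F b, F &error) {
    // volatile keeps the three additions separate and in order
    // relative to the rounding-mode changes
    volatile F va = a, vb = b;
    const int round = std::fegetround();

    std::fesetround(FE_TONEAREST);
    volatile F c = va + vb;

    std::fesetround(FE_DOWNWARD);
    volatile F c_lo = va + vb;

    std::fesetround(FE_UPWARD);
    volatile F c_hi = va + vb;

    std::fesetround(FE_TONEAREST);
    error = std::max(c - c_lo, c_hi - c);

    std::fesetround(round);  // restore the caller's rounding mode

    return c;
}


int main() {
    float a = 23.23528f;
    float b = 4.234f;
    float e;

    std::cout << sum(a, b, e) << std::endl;
    std::cout << e << std::endl;
}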

      

A quick estimate of the maximum error is returned in the argument error. Be aware that switching the rounding mode flushes the floating-point (FPU) pipeline, so don't expect this to be fast.

A better solution would be to try interval arithmetic (which tends to give pessimistic error intervals, since correlations between variables are not accounted for) or affine arithmetic (which tracks correlations between variables and therefore gives somewhat tighter error bounds).

A primer on these methods is available here: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.8089
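The pessimism of plain interval arithmetic can be seen in a tiny sketch. The Interval type and sub function are hypothetical names, and rounding is ignored here to keep the focus on the missing correlation:

```cpp
// Hypothetical minimal interval type, ignoring rounding for clarity.
struct Interval {
    double lo;
    double hi;
};

// Interval subtraction: the smallest result pairs a.lo with b.hi,
// the largest pairs a.hi with b.lo.
Interval sub(Interval a, Interval b) {
    return { a.lo - b.hi, a.hi - b.lo };
}
```

For x = Interval{1.0, 2.0}, sub(x, x) gives [-1, 1], although x - x is exactly 0 for every value in x. Interval arithmetic does not know that both operands are the same variable; affine arithmetic keeps that information and would report a zero-width result.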

+1

